CIM icon indicating copy to clipboard operation
CIM copied to clipboard

Rewrite parse options clean

Open guicho271828 opened this issue 10 years ago • 19 comments

major deferrences:

  1. Re-implementation of parse-options, based on weakly-functional style.
  2. In production mode, the lisp codes are concatenated into one file by lib/build.sh . In debugging and testing environment, they are loaded by asdf.
  3. Unit testing (on lisp. using fiveam)
  4. Unit testing (on shell script. using shunit2)
  5. Safety -- avoid open and use with-open-file etc.
  6. Readme and documentation. More specification on the environment of loading and evaluating a lisp code.

basic functions like cl -e (print 1) now works.

TODOs:

  1. I have to test the lisp scripts in $SIM_HOME/script/ such as ql_cmd_deps.
  2. Test codes for shell-scripts are still insufficient.

guicho271828 avatar Jun 14 '14 08:06 guicho271828

Thank you for such big pull request. I have many thing to say. First of all, "Refactor without tests is mere destructure". In this case, it's all my responsibility not having written tests. However, I have to mention some points.

  • In cim.asd

    • The License is BSD, not LLGPL.
  • In mains.lisp

    • *raw-argv* is missing. Though it is not exported, it is a part of cim. Also, it is important it is provided in the form of a variable, not a function.

    • at line 22, why do you signals repl-entered? At least, what doing here is not running repl but a execution like

      cl < script.lisp

    • this read-expressions-from-stdin doesn't treat EOF (this is original CIM's bug, have fixed in master).

  • In option-parser.lisp

    • help message is specialized to cl but parse-options is used at other places (e.g. ql_cmd_deps) so it shouldn't be. Now you may understand why the variable generated-help is provided.
    • hook style is also specialized to cl. If you want to do like that, declare hooks where option-parser is used.
    • to pass an arg to short option, both form like -iEXT(no spaces between -i and EXT) and -i EXT(have space) should be allowed. The reason is related to optional argument and incompatibility between each OSes' interface.
    • why do you refactor parse-options? Current parse-options is enough stable and efficient. What is the reason why you want to refactor with making above degression?
  • In packages.lisp

    • left the first line blank because this file comes the top of script.lisp after you run ./build.sh and ECL ignores the first line regardless of whether the first line contains shebang or not.
    • cl of CIM is not a library but a thin compatibility layer so separating cim and cim.impl is of little use because users don't touch cim package. If you provide a library of functions for compatibility, use cim.ext (existed in the past, but now is obsolete).
    • Contrastively, cim.repl is needed to provide (in future) users way to customize repl. *history* and such stuffs are provided for that purpose.
  • In process-args.lisp

    • though I'm not sure if you intended to or not, the behavior of -e (--eval) is changed. Current implementation allows such forms like -e '(write-line "Hello, ' -e $USER -e '")'. I myself think this behavior is funny and if you intended to change it, it is OK.
    • about the behavor of -e (--eval), I think resetting *package* for each sexp is overdone. If you are caring about -i, wait a moment. Because I noticed correctly treating -i in common lisp is very hard (current implementation have bugs, I noticed), I'm planning to make -i obsolete.
    • about the behavor of -p (--package), have you cared about what if both --package and --repl(or [programfile]) are given?
    • the behavior of -v is changed. It should not be. If you want to provide a option to set verbosity, use --verbose n
  • In repl.lisp

    • REPL's default package should be common-lisp-user, not cim.
  • In res/cim_installer

    • What the purpose of putting this file? If this file is a document for developers, write comment in script/cim_installer instead. If for user, this is of little use because user doen't execute script/cim_installer directly.
  • In scripts/cim_installer

    • don't use git. CIM should not depend on non-posix commands as far as possible. If you want to get archive from github, use codeload.github.com
    • I'm assuming this change is for your developping. If you want to provide users a way to select repositories, change cim get. When you do it, you can refer RVM's rvm get
  • Finally, this patch slows down cl's start up time +0.3s. Because cl is developed for easy use from shell, cl can be called over and over (e.g.

    find ./ -name '*.txt' -exec cl script.lisp '{}' ';'

executes cl as many times as the occurance of .txt file below the current directory, which can be over thouthands) so start up time counts.

The rest of changes are very greatful but points mentioned above are concerned.

KeenS avatar Jun 14 '14 20:06 KeenS

ok, I read the comments. I will quickly fix those probelms. Thanks!

guicho271828 avatar Jun 15 '14 02:06 guicho271828

Regarding those comments, since there were no test scripts before, it was as if none of them were specified. At least they are only vaguely stated in README, but it is informal (informal in a sense versus machine-readable). That's the reason why we have to adopt TDD or BDD or whatsoever.

For each of those specifications (like -e should append its arguments and the lisp should run the concatenated string through), we should add a test for it. Only after doing so should the behavior be statically and formally specified.

guicho271828 avatar Jun 15 '14 03:06 guicho271828

I'll make up the issues threads for each of those comments, in my own repo.

guicho271828 avatar Jun 15 '14 03:06 guicho271828

I make some more suggestions and comments.

Firstly, I guess --version is less used, while modifying the verbosity is common among many utilities. Also, adding something like "-v -v -v" and "-vvv" to increase the verbosity is common (for example lsusb). It will be useful if this handling is provided by default. I agree with adding --verbose n option, but I add both --verbose n and -v.

Secondly,

why do you refactor parse-options? Current parse-options is enough stable and efficient.
What is the reason why you want to refactor with making above degression?

Since parse-options was so big, have many features at once and very very stateful, we cannot unit-test each feature separately. Only the integration testing is possible. Also, since we had no test scripts before -- I mean reproducable, public and static test codes -- I didn't find such codes in your repo -- , I or any other person can assure the library is well-tested, other than trusting you or man-power testing. Personally speaking, only under the situation where comprehensive tests are provided and many people runs the test script in their environment, the library can boast its well-testedness.

Thirdly,

don't use git. CIM should not depend on non-posix commands as far as possible. If you want to get archive from github, use codeload.github.com

Modification to cim_installer is for the debugging purpose only. see shell-test/install.sh. This script gets the archive from the local repository (that is ../), unpack it in shell-test/cimtest and so on. CIM of course should not depends on the other libraries on normal use, however I think it is reasonable to do so during the debugging. It is just like using asdf while testing.

guicho271828 avatar Jun 15 '14 03:06 guicho271828

  • About verbosity I see your opinion but I stick to current implementation. How about -V for verbosity? I'm caring about cases that someone expecting cl -v shows version types cl -v suddenly gets verbose outputs and finaly it looks hunged up (actually, reading input). It is better than that for someone expecting cl -v incremets verbosity and simply get version printed. If you want furthure discussion, another issue is desireble.
    • About rewriting option-parser Acturely, I didin't consider testability when I wrote parse-option. Now I'm grateful for your work.
  • About using git I didn't know git have such a feature. This also boosts testability. Thanks a lot.

KeenS avatar Jun 20 '14 06:06 KeenS

thanks, now I'm back in japan.

-V :+1:

Regarding concatenated -e option:

I think I have two ideas about it. 1st one is an implementation idea that does not rely on the explicit string concatenation seen in the previous code (which prevents the evaluations under distinctly separated environment). read-from-string signals eof-error when the parsing fails, so we can catch it and handle it.

Another one is providing -E option. but I'm not sure what behavior we can define on it.

guicho271828 avatar Jul 01 '14 12:07 guicho271828

hmm, it might be odd to see Capicalized verbosity options like -VVV, but well, that can be funny.

guicho271828 avatar Jul 01 '14 12:07 guicho271828

I'm not sure *raw-argv* as a variable is a good idea, because it is not immutable. Once changed, it no longer guarantee it is truely raw. How about *raw-argv* as a symbol-macro to (get-raw-argv)? In a user's point of view they are not so different (though there might be a negligeble performance drawback). Also, if we make get-raw-argv cache its output, then the drawback is zero.

guicho271828 avatar Jul 05 '14 17:07 guicho271828

Now to summarise:

Fixed:

  • The License is BSD, not LLGPL.
  • *raw-argv*
  • repl-entered -> read-from-stdin
  • read-from-stdin treats EOF
  • (not-related) "$CIM_HOME/init.lisp" is loaded under protected package
  • ECL ignores the first line: already fine (81e42edcebfe7f1e36a7ed193152aca2b84cc077)
  • cim.repl

If you are caring about -i, wait a moment. Because I noticed correctly treating -i in common lisp is very hard (current implementation have bugs, I noticed), I'm planning to make -i obsolete

I think -i is a good idea, not to be dismissed. Please don't make it obsolete. Rather, write a shortest code that reproduce the same error, then turn it into a test script.

  • REPL's default package should be common-lisp-user, not cim -- now REPL is run under a protected package (specified with -p).
  • verbosity option (done as you specified)

guicho271828 avatar Jul 05 '14 23:07 guicho271828

Fixed:

  • generated-help

Not fixed yet / discussion remains:

  • hook style is also specialized to cl. -- not really, (at least now on)
  • -iEXT and -i EXT
  • separating cim, cim.impl, cim.ext
  • res/cim_installer -- just added in case.
  • scripts/cim_installer git XXX.git -- debugging (I guess this discussion is already closed)
    • cim get / rvm get
  • benchmarks -- discussed in another comments.

guicho271828 avatar Jul 05 '14 23:07 guicho271828

Regarding benchmarks, I wish If I had a benchmark script. I admit that the current implementation is not so optimized, but just telling me how you've measured the loading speed is very helpful.

Regarding how we can reduce the loading time, I once thought about making a core in which CIM is already loaded. There are already some libraries in this line of work. (trivial-dump-core etc.) see https://github.com/guicho271828/CIM/blob/rewrite-parse-options-clean/save-image.org . As it turned out, it requires too much work.

However I have another & easier & ANSI compliant method. Why not just compile-file script.lisp? In compile-file, lisp files are compiled into files with arbitrary name. http://www.lispworks.com/documentation/HyperSpec/Body/f_cmp_fi.htm

guicho271828 avatar Jul 06 '14 00:07 guicho271828

Thank you a lot of great works.

About *raw-argv*: I think mutability is useful for paticular case such as emulation in case someone would write CIM plugin. I know it's rare case. How about leaving *raw-argv* unexported and exporting get-raw-argv?

About benchmark and related: I simply compared time sbcl --eval '(quit)' and time cl --eval '(exit)'. Compiling file is what I once considered but gave up with caring about the conflict of compiled name. In that time, I didn't know compile-file has output-file option. Now that it's turned out you can name as you like, it is possible choice. However, managing compiled files of each impls of each version needs some amount of work. Do you think it worth to work?

KeenS avatar Jul 06 '14 06:07 KeenS

just done. I used (getenv "LISP_IMPL") to manage fasl files from different implementation.

The result is, super speedup!

./benchmark.sh 
...
total runtime of 5 runs:
w/ quicklisp

1 
2 
3 
4 
5 
real    0m5.228s
user    0m4.857s
sys 0m0.361s
w/o quicklisp with --no-init option

1 
2 
3 
4 
5 
real    0m1.653s
user    0m1.416s
sys 0m0.221s
w/o quicklisp, script.lisp compiled
cl --no-init -c cimtest/lib/script.lisp -Q
; compiling file "/mnt/video/guicho/repos/lisp/CIM/shell-test/cimtest/lib/script.lisp" (written 06 JUL 2014 07:57:53 PM):
; compiling (IN-PACKAGE :CL-USER)
; compiling (DEFPACKAGE CIM.IMPL ...)

...

; compiling (IN-PACKAGE :CIM)
; compiling (MAIN); 
; compilation unit finished
;   caught 6 STYLE-WARNING conditions
;   printed 2 notes


; /mnt/video/guicho/repos/lisp/CIM/shell-test/cimtest/lib/script.sbcl-system written
; compilation finished in 0:00:00.428

1 
2 
3 
4 
5 
real    0m0.190s
user    0m0.098s
sys 0m0.099s

The problem might be sbcl-system.

guicho271828 avatar Jul 06 '14 11:07 guicho271828

I ensured the benchmark works correctly on sbcl, ccl, clisp.

guicho271828 avatar Jul 06 '14 12:07 guicho271828

Now, the basic testing facility is fully set up. Just go to shell-test and run make all_impls. CIM Installation, implementation, lisp codes, everythings are tested automatically.

guicho271828 avatar Jul 06 '14 13:07 guicho271828

Finally, it's now able to merge.

guicho271828 avatar Jul 06 '14 15:07 guicho271828

TODO:

  • on ECL, compiled script.lisp does not work because of a0534a7fc7a0ce4c2cb31f7510483dcb9ed8c7e5 , I guess. ECL says it failed to find asdf.

guicho271828 avatar Jul 06 '14 15:07 guicho271828

I'm trying to merge this patch but it's very challenging. Could you split the patch into 1 feature/PR and resolve conflicts? Tests and README correction can be immediately accepted. And with the patch gotten smaller, I can correct the rest problems.

KeenS avatar Jul 15 '14 15:07 KeenS