Even more speed
#1269 brought great speedups for Hy. I have one more measurement that I don't understand:
$ echo >test.hy
$ hy test.hy # to byte-compile
$ time hy test.hy
hy test.hy 0.05s user 0.01s system 98% cpu 0.061 total
$ python __pycache__/test.cpython-35.pyc # to cache
$ time python __pycache__/test.cpython-35.pyc
python __pycache__/test.cpython-35.pyc 0.01s user 0.00s system 71% cpu 0.011 total
Why is there still a difference? Can it be cut down to the same value?
I see similar times for a file like
import hy
import test
So I guess loading hy takes 50ms here? Importing only hy.importer produces about the same results. Is there a way to strip these 50ms? Once test.hy is byte-compiled there is really no need to load the whole runtime, right?
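To check where the 50 ms goes, one option is to time a single import in a fresh interpreter so that nothing is already cached in `sys.modules`. The following is only a sketch: `time_import` is a hypothetical helper, and `json` is a stand-in module so the script runs without Hy installed; pass `"hy"` or `"hy.importer"` where Hy is available.

```python
import subprocess
import sys


def time_import(module_name):
    """Return the seconds spent importing module_name in a fresh interpreter."""
    snippet = (
        "import time\n"
        "start = time.perf_counter()\n"
        "import {}\n"
        "print(time.perf_counter() - start)\n"
    ).format(module_name)
    out = subprocess.check_output([sys.executable, "-c", snippet])
    # Use the last line in case the imported module prints something itself.
    return float(out.decode().splitlines()[-1])


if __name__ == "__main__":
    # "json" is a stand-in; replace it with "hy" where Hy is installed.
    for name in ["json"]:
        print("{}: {:.1f} ms".format(name, time_import(name) * 1000))
```

On Python 3.7 and later, `python -X importtime -c "import hy"` gives a per-module breakdown without any helper script.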
I've noticed this, but haven't looked into it closely. It seems that in order to get that last bit of speed, we need to delay importing various parts of Hy until we need them, which I'm afraid will be annoying to implement, but since I haven't tried, that's only a guess. It might be worth it, anyway.
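As a rough illustration of what delayed importing could look like, the standard library's `importlib.util.LazyLoader` postpones executing a module until one of its attributes is first accessed. This is a generic sketch, not how Hy is implemented: `lazy_import` is a hypothetical helper, and `colorsys` is just a stand-in for a module that may not be needed on every run.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code only runs on first attribute access."""
    if name in sys.modules:
        # Already imported (eagerly or lazily); reuse it.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named {!r}".format(name))
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; it does not run the module.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")  # cheap: module body has not run yet
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first attribute access triggers the real import
```

The trade-off is that import errors and import-time side effects surface later, at first use, which is part of why this can be annoying to retrofit.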
At the same time, I'm more concerned about the speed of one-liners (hy -c '(print (+ 1 1))') than starting a program in a file, since we can't byte-compile a one-liner we haven't seen yet. I'm hoping that caching the lexer or something will help with this.
I think part of the general problem is simply that Hy is written in Python. Therefore, it is physically impossible for Hy's startup to be just as fast as Python's. When you start Hy, this happens:
- Python starts. There goes 0.01s.
- All the libraries are imported. Even when compiled to bytecode, this can take a bit, since some of the libraries can have bigger-ish import chains.
- The bytecode for the program has to be checked (to make sure it's valid) and loaded. You've got mtime comparisons and the like here.
- The bytecode has to be run. Along with that, it will bring in the Hy core libraries, whose bytecode also needs to be checked and run.
So the times really aren't too far off.
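The bytecode check mentioned in the list can be approximated by hand. The sketch below assumes the timestamp-based `.pyc` layout of Python 3.7 and later (a 16-byte header: magic number, flags, source mtime, source size); the Python 3.5 files shown earlier in this thread use a shorter header without the flags field. `pyc_matches_source` is a hypothetical helper, not the import system's actual code.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def pyc_matches_source(source_path, pyc_path):
    """Roughly mimic the timestamp check done before cached bytecode is reused."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    # The magic number ties the file to one bytecode format version.
    if len(header) < 16 or header[:4] != importlib.util.MAGIC_NUMBER:
        return False
    flags, mtime, size = struct.unpack("<III", header[4:16])
    if flags != 0:
        raise ValueError("hash-based .pyc; not handled by this sketch")
    st = os.stat(source_path)
    # Both values are stored as 32-bit unsigned integers in the header.
    return (mtime == (int(st.st_mtime) & 0xFFFFFFFF)
            and size == (st.st_size & 0xFFFFFFFF))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "demo.py")
        pyc = os.path.join(tmp, "demo.pyc")
        with open(src, "w") as f:
            f.write("x = 1\n")
        py_compile.compile(
            src, cfile=pyc, doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        print(pyc_matches_source(src, pyc))
```

Each imported module with a cached file goes through a `stat` call and a header comparison like this, which is cheap individually but adds up over a long import chain.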
We might end up going the other way with this. See #1324.
In hindsight, this seems a little silly. A 40-ms overhead for importing Hy seems acceptable and not something that can feasibly be improved on very much. Neither Hy nor Python can be expected to produce programs with startup times that are comparable to native code, so if you want to write a program that will be launched thousands of times per second, this is the wrong tool for the job.
You are right. On the other hand, if you take much longer to boot than pure Python, then by definition you are disqualified from cases where a pure Python script was still a valid choice.
I disagree; in fact, I can't think of a scenario where a 10 ms startup time would be good enough but not 40 ms. Besides, real Python code will be importing other modules and thus won't obtain 10 ms, either.
If startup takes 10ms and your script runs another 10ms, the total runtime is 20ms. With hy it'll be 60ms, 3x slower. Putting it in a shell loop pipeline might no longer be an option.
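The arithmetic behind that comparison, spelled out with the figures quoted in this thread (10 ms for Python startup, roughly 40 ms more for importing Hy, and an assumed 10 ms of actual work); the function and variable names are illustrative only:

```python
def total_runtime_ms(startup_ms, work_ms):
    """Total wall time of one invocation: interpreter startup plus the script's work."""
    return startup_ms + work_ms


PYTHON_STARTUP_MS = 10              # from the timings in this thread
HY_STARTUP_MS = 10 + 40             # Python startup plus ~40 ms Hy import overhead
WORK_MS = 10                        # assumed runtime of the script itself

python_total = total_runtime_ms(PYTHON_STARTUP_MS, WORK_MS)
hy_total = total_runtime_ms(HY_STARTUP_MS, WORK_MS)
slowdown = hy_total / python_total

print(python_total, hy_total, slowdown)  # 20 60 3.0
```

The ratio shrinks as the script's own work grows: with 1000 ms of work, the same 40 ms overhead is a difference of about 4%.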
What I take from your point of view is that you don't care for such a rare use case, and that's reasonable.