hy
hy copied to clipboard
Looks like `replace` does unnecessary work
I think I'm seeing some unnecessary work going on: HyList.replace
calls replace_hy_obj
, which in turn can wrap_value
. But the result of replace_hy_obj
is discarded and thus, depending on how a particular expression is constructed, its possible wrap_value
is evaluated 3 or 4 times on the same object.
Fixing HyList.replace
so it instead self mutates:
def replace(self, other):
self[:] = [replace_hy_obj(x, other) for x in self]
HyObject.replace(self, other)
return self
Wrapping only happens once now. But this seems to have some consequences with using iterables in unquote. The following expression now does this:
=> (defn numbers [] (yield 1) (yield 2) (yield 3))
=> (macroexpand `(do ~(numbers)))
HyExpression([
HySymbol('do'),
HyList([
HyInteger(1),
HyInteger(2),
HyInteger(3)])])
instead of
=> (defn numbers [] (yield 1) (yield 2) (yield 3))
=> (macroexpand `(do ~(numbers)))
HyExpression([
HySymbol('do'),
<generator object numbers at 0x7f2d2da38db0>])
(which can't be compiled: TypeError: Don't know how to wrap a <class 'generator'> object to a HyObject)
It also means unquote-splice
's compiler logic can be simplified a bit as well.
Are there any other concerns or thoughts on the topic before I dig further into this? I couldn't find any other open issues around this, but maybe I just don't know what to look for?
Using a mutable data structure for Hy code is a bit dangerous, but convenient in macros. If the compiler has to look at the same code object more than once, but mutates it during the compilation process, that could cause bugs. There's nothing stopping a macro from returning multiple references to the same HyExpression
object. The compiler must not break code it hasn't gotten to yet.
Ideally, we'd be using immutable data structures like Clojure does, so we wouldn't have to worry about it. But relying on Pyrsistent or something would add another dependency. And if we used a simple singly-linked list like most other Lisps, we'd lose most of the nice list methods that Python programmers are used to. That means we need to be very careful about mutations in the compiler, and I'm already not certain if we've tested this well enough.
I think making replace
return a new object is likely going to work well enough. We can be a bit more aggressive with __slots__
, and maybe also consider backing HyList
with tuple
instead of list
- that should also be a slight performance improvement.
You shouldn't see any loops in the data structures, so the Python reference counter should be just fine.
The mutation was a quick hack to see what would happen. For all I knew the everything would have outright blown up.
One thing to note: compile times are pretty slow to begin with, so we'd have to be pretty careful using immutable data structures, and definitely not linked lists.
Yeah, but immutable should be faster, not because immutable datastructures are faster, but because using pop to iterate through a list is slow:
In [1]: def consumelist1(x):
...: while x:
...: x.pop(0)
...:
In [2]: def consumelist2(x):
...: for _ in x:
...: pass
...:
In [3]: %timeit consumelist1(list(range(100)))
21.2 µs ± 1.13 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [4]: %timeit consumelist2(list(range(100)))
2.24 µs ± 332 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
I suspect making HyExpression an immutable tuple, and dealing with the occasional mutability when needed will perform a lot faster than constantly mutating lists.
Some initial work I've done seems promising like it might improve compile times, but I haven't quite got Hy working fully yet.
Where we need lookahead, we can built it into the iterator instead: https://github.com/erikrose/more-itertools/blob/master/more_itertools/more.py#L135 (we could always borrow the code too instead of adding another dependency)
Using a mutable data structure for Hy code is a bit dangerous
In fact, it created hylang/hy#1537, and probably a lot of other subtle bugs we haven't found yet.
using pop to iterate through a list is slow
See hylang/hy#1406.
we'd have to be pretty careful using immutable data structures, and definitely not linked lists.
Performance claims need actual measurements. I don't know where the compiler bottlenecks are. We could try different approaches and profile, but changing the Hy model interface seems like a lot of work because it's a pretty deep change.
Linked lists seem to work fine for other Lisps. But maybe they have some kind of background optimizations for that kind of thing. Something like the Clojure seq abstraction might be an option. We could back it with any iterable for performance and still treat it like an immutable linked list by caching the returned values. hylang/hyrule#32 Hy also has a cons. We'd be able to integrate that better with seq.
But maybe inheriting from tuple is good enough. In macros, you'd usually use the quasiquote syntax anyway. But if you did want to construct code using a listcomp or something, you could use a normal list and still convert it afterwards to the tuple form using the HyExpression
constructor. It's a little less convenient, but a lot less error prone than the status quo.
Now it looks like we always use the return value of hy.as_model
(née wrap_value
), and the sequential types like List
are immutable anyway.