
Added a SimpleBunch C extension type for speeding the most common operations up

Open dsuch opened this issue 13 years ago • 3 comments

Hi David,

care to take a look at this pull request?

The thing is that there are some core operations, the most common ones, that I'd like to speed up. If you compile the package now and run this gist, for instance,

https://gist.github.com/1961505

you'll notice that for what I consider the most commonly used access patterns, which are

import bunch

d = {1:11, 2:22, 3:33}
sb = bunch.SimpleBunch(d)

sb.aa = 'bb'
'aa' in sb
del sb['aa']

well, the difference is an order of magnitude: bunch.SimpleBunch is about 10-15x faster than bunch.Bunch and some 20-30% slower than a plain dict.
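For readers without the gist at hand, the kind of comparison described above can be sketched with timeit. This is not the gist itself: AttrDict is a hypothetical pure-Python stand-in (it is neither bunch.Bunch nor the C SimpleBunch), and the function names and iteration count are assumptions chosen for illustration. No timings are claimed here; the numbers depend entirely on the machine and interpreter.

```python
import timeit


class AttrDict(dict):
    """Hypothetical pure-Python stand-in for an attribute-style dict."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Store every assigned attribute as a key.
        self[name] = value


def exercise_items(factory):
    """Set, test membership and delete using item access (works for dict)."""
    m = factory({1: 11, 2: 22, 3: 33})
    m['aa'] = 'bb'
    found = 'aa' in m
    del m['aa']
    return found


def exercise_attrs(factory):
    """The same pattern as in the snippet above, setting via attribute access."""
    m = factory({1: 11, 2: 22, 3: 33})
    m.aa = 'bb'
    found = 'aa' in m
    del m['aa']
    return found


def compare(number=20000):
    """Return total seconds per implementation for `number` repetitions."""
    return {
        'dict': timeit.timeit(lambda: exercise_items(dict), number=number),
        'AttrDict': timeit.timeit(lambda: exercise_attrs(AttrDict), number=number),
    }


if __name__ == '__main__':
    for name, seconds in compare().items():
        print(f'{name}: {seconds:.4f}s')
```

Swapping bunch.Bunch and bunch.SimpleBunch in for AttrDict, if they are installed, should give the comparison discussed in this thread.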

There's a caveat, though. I'm calling it a 'simple' Bunch because only getattr and setattr are implemented, and even then setattr assumes everything should be stored as the SimpleBunch's keys. In other words, I haven't seriously played with the whole getattribute machinery at the C API level. That could be an area for future contributors to explore, and it's possible that bunch.Bunch would then become a subclass of bunch.SimpleBunch, but that's for the future.
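To make the caveat concrete, here is a pure-Python sketch of what "setattr stores everything as keys" could look like. KeyOnlyBunch is a hypothetical name and this is only one reading of the description above, not the C code from the pull request.

```python
class KeyOnlyBunch(dict):
    """Hypothetical illustration: every attribute assignment becomes a key."""

    def __getattr__(self, name):
        # Reached only when regular attribute lookup finds nothing.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # No check for existing attributes or methods: always store a key.
        self[name] = value


sb = KeyOnlyBunch()
sb.aa = 'bb'          # stored as sb['aa']
sb.keys = 'a value'   # stored as sb['keys'], the dict method is untouched

print(sb['aa'], sb['keys'], callable(sb.keys))
```

With this simplification, a name that collides with a dict method ends up as a key while attribute access still returns the method, which is the sort of corner the fuller getattribute handling would have to deal with.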

The whole thing is imported optionally: if the extension isn't available, bunch will define SimpleBunch as an alias for the regular Bunch.
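The optional import usually follows the pattern sketched below. The module name _hypothetical_simple_bunch is made up for the example (the actual extension module name is not given in this thread), and the Bunch class here is a minimal stand-in rather than the real one.

```python
class Bunch(dict):
    """Minimal stand-in for the pure-Python Bunch, for illustration only."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


try:
    # Hypothetical module name for the compiled C extension.
    from _hypothetical_simple_bunch import SimpleBunch
except ImportError:
    # No compiled extension available: fall back to the Python class.
    SimpleBunch = Bunch


sb = SimpleBunch({1: 11, 2: 22, 3: 33})
sb.aa = 'bb'
print('aa' in sb)
```

Callers can then use bunch.SimpleBunch unconditionally and get the fast version whenever it was built.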

You might want to have a look at '_simple_bunch_systems' in setup.py and add some other systems that are likely to have a C compiler handy. This of course assumes a C compiler will always be available on any such system, but I hope that isn't too bad an assumption.
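One way such a platform gate in setup.py might look is sketched here. The list contents, the extension name and the source path are all assumptions made for the example; the actual '_simple_bunch_systems' values are not shown in this thread.

```python
import sys

# Hypothetical stand-in for the '_simple_bunch_systems' list in setup.py.
SIMPLE_BUNCH_SYSTEMS = ('linux', 'darwin')


def wants_extension(platform_name=None):
    """Return True if the C extension should be attempted on this platform."""
    if platform_name is None:
        platform_name = sys.platform
    # str.startswith accepts a tuple, so 'linux2' and 'linux' both match.
    return platform_name.startswith(SIMPLE_BUNCH_SYSTEMS)


def build_ext_modules(platform_name=None):
    """Return the ext_modules list to pass to setup(), possibly empty."""
    if not wants_extension(platform_name):
        return []
    # Imported lazily so the platform check itself has no dependencies.
    from setuptools import Extension
    return [
        Extension(
            'bunch._simple_bunch',              # hypothetical module name
            sources=['bunch/_simple_bunch.c'],  # hypothetical source path
            optional=True,  # a failed compile does not abort the install
        )
    ]
```

Marking the extension as optional softens the compiler assumption: if the build fails, installation continues and the pure-Python fallback is used.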

What do you think of it?

Cheers!

dsuch, Mar 02 '12 21:03

OK, I've added some more to it but I'll stop at that :-)

What I did was make the C implementation of getattr and setattr actually match the Python one. That means the code is now about 4-5 times slower than built-in dictionaries, yet it's still 4-5 times faster than the pure-Python Bunch implementation.

I think the code can be merged, although I certainly wouldn't make it the default for now. Let people use it and maybe spot things I've overlooked. Let there be more feedback.

Cheers!

dsuch, Mar 05 '12 13:03

Can I put in a request that these C extensions be evaluated and perhaps incorporated? I love the code style improvements Bunch provides, especially when dealing with JSON-like structures. But some high-level speed testing using MongoDB through pymongo shows that Bunch s l o w s things down significantly.

Cheers.

DannyGoodall, Jul 19 '12 10:07