django-dag icon indicating copy to clipboard operation
django-dag copied to clipboard

query optimization

Open jsykora opened this issue 14 years ago • 6 comments
trafficstars

This isn't a bug or problem per se- but while doing some query optimizations in my project, I realized that the descendants_tree() (and I assume ancestor_tree and ans_set/des_set) command(s) uses a large number of queries to complete. I'm wondering if there is an easy way to cut down the number of queries for the nested structure. I had something in mind like a select_related() query that could at least cut down the number of db hits. For now, the rest of my project is too important to overlook, so it may be a while until I can get back to it and work on a pull request. Thanks for this project!

jsykora avatar Jun 13 '11 20:06 jsykora

I agree, there could be a problem with large datasets and deep trees.

I'm not going to fix it right now, but using a different approach (something like MPTT [1] does) might worth considering.

[1] https://github.com/django-mptt/django-mptt/

elpaso avatar Jun 14 '11 06:06 elpaso

Alright. I'll keep you updated if I come up with any improvements. This project is working great for my needs, thanks for your work.

jsykora avatar Jun 14 '11 13:06 jsykora

http://www.postgresql.org/docs/8.4/static/queries-with.html is an approach which would be very useful for this, and shouldn't be too hard to adapt for. However, it's not going to be portable -- MySQL and Sqlite don't support this query unless I'm mistaken.

I'm about to begin work on this. Given the non-portability of this, I'm thinking of forking this and renaming it (django-cte-dag?) so that django-dag users who need to deal with deep graphs can benefit from Postgresql's recursive queries, while django-dag can continue to serve users of other databases with simpler needs.

It's worth noting that for small to medium-sized graphs you could use transitive closures, and there's a library out there called django-directed-acyclic-graph which does that.

hyperair avatar Nov 27 '12 07:11 hyperair

Hi @elpaso, thanks for this package. Solves my use case with slight modifications. I want it to be scalable upto ~100K node and edges. I just read your comment -

I'm not going to fix it right now, but using a different approach (something like MPTT [1] does) might worth considering.

I am curious to know more about your approach.

Also, did anyone else work on optimizing queries?( I'm using MySQL database.)

a1Gupta avatar Apr 08 '17 12:04 a1Gupta

@a1Gupta Hi, I'm not actively working on this project anymore, but I'll be happy to review any PR, provided that it applies to general use cases of the library.

elpaso avatar Apr 10 '17 06:04 elpaso

@elpaso Hi, I am trying to find an efficient way to do it. I've added a post on reddit for the same. You can also add your ideas there.

a1Gupta avatar Apr 17 '17 03:04 a1Gupta