
NodeSet parsing performance: improve and add docs

Open degremont opened this issue 7 years ago • 2 comments

Parsing a very large nodeset with thousands of nodes, like

pattern = "nova1000,nova1205,nova1235,nova1001,..."
NodeSet(pattern)

is much slower (3 to 4 times) than

NodeSet.fromlist(pattern.split(","))

This is even worse if the pattern has nodes with several dimensions!

It is true that passing a list of short patterns helps CS optimize the parsing, but I see no reason why it should be so much slower when NodeSet does the comma-separated parsing itself.
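A rough way to reproduce the comparison is sketched below. It assumes ClusterShell is installed and importable as `ClusterShell.NodeSet`; the helper names `make_pattern` and `benchmark`, the node count, and the `nova` prefix are only illustrative, and the actual ratio will depend on the ClusterShell version and the shape of the pattern.

```python
import random
import timeit


def make_pattern(count, prefix="nova", start=1000, seed=0):
    """Build a comma-separated pattern of `count` distinct node names.

    Names are shuffled so that neighbouring entries are not consecutive,
    similar to the pattern shown in this ticket. (Hypothetical helper.)
    """
    rng = random.Random(seed)
    indexes = list(range(start, start + count))
    rng.shuffle(indexes)
    return ",".join(f"{prefix}{i}" for i in indexes)


def benchmark(pattern, repeat=3):
    """Return best-of-`repeat` timings in seconds for both construction paths.

    The first value is for NodeSet(pattern), the second for
    NodeSet.fromlist(pattern.split(",")). (Hypothetical helper.)
    """
    # Imported here so the pattern generator works without ClusterShell.
    from ClusterShell.NodeSet import NodeSet

    direct = min(timeit.repeat(lambda: NodeSet(pattern),
                               number=1, repeat=repeat))
    split = min(timeit.repeat(lambda: NodeSet.fromlist(pattern.split(",")),
                              number=1, repeat=repeat))
    return direct, split
```

For example, `direct, split = benchmark(make_pattern(5000))` followed by printing `direct / split` gives the slowdown factor on a given machine.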

I first opened this ticket as a reminder that we should add documentation on performance considerations for CS, and especially for NodeSet, which is widely used and can be slow if used badly.

This ticket will help to keep track of this particular performance issue and see if we can do better in the future.

degremont, Jan 08 '18 10:01

I tracked this down to a suboptimal method, _next_op(). I'm working on a patch for this.

degremont, Jan 08 '18 11:01

We just need the NodeSet performance documentation part before closing this ticket.

degremont, Apr 20 '18 08:04