automata icon indicating copy to clipboard operation
automata copied to clipboard

Add `min_length` and `max_length` parameters to `successors`/`predecessors`

Open Tagl opened this issue 1 year ago • 6 comments

I think the successors/predecessors functions are quite useful for enumerating words that are accepted by the DFA. Sometimes I want to limit the length of the word in an infinite language. So far I have been limiting lengths by intersecting with DFA.of_length.

It might be a good idea to add parameters for this, as it may be far more efficient to take apply that approach than to intersect with another DFA. One additional benefit of this is it will guarantee that execution of the function ends, which is not true right now for infinite languages. We might even want to require max_length or default to some reasonable upper bound based on the size of the alphabet. I'm planning on changing the implementation myself to support this in the near future, assuming it is beneficial with regards to performance.

@caleb531 @eliotwrobson do you have any thoughts on this?

Tagl avatar Feb 07 '24 16:02 Tagl

This seems like a pretty reasonable change that shouldn't blow up the API too much. Go for it!

eliotwrobson avatar Feb 07 '24 17:02 eliotwrobson

@Tagl This is a great idea, and seems very practical. I would suggest something like an optional max_length parameter with a default value of None (meaning no limit). And min_length with a default value of 0.

@eliotwrobson What is our convention in the codebase for length-related variable/parameter names? So we use "length" or "len"?

caleb531 avatar Feb 08 '24 22:02 caleb531

@caleb531 I think we usually use "length" (since "len" shadows the built-in function), but this might be inconsistent in some places.

eliotwrobson avatar Feb 08 '24 22:02 eliotwrobson

@eliotwrobson Yeah, now that I'm at my personal machine, I performed a quick search in the codebase. I see many uses of "length", e.g.:

  • DFA.minimum_word_length()
  • DFA.maximum_word_length()
  • DFA.words_of_length()
  • DFA.count_words_of_length()
  • DFA.of_length

So far, the only deviation I can find is _populate_count_cache_up_to_len, but that is an internal method (which perhaps we should rename for consistency).

Regardless, I think our established convention seems pretty evident from the above. And as for max_length vs. max_word_length, the DFA.of_length() method omits "word" in the parameter names to give min_length and max_length, so I think the latter are what we want to use here.

caleb531 avatar Feb 08 '24 22:02 caleb531

@Tagl have you been able to take a look at this? I actually have a use case where this would be very handy, so I'd like to include it in the version we release once the current package review is complete.

eliotwrobson avatar May 15 '24 00:05 eliotwrobson

Oh I thought I'd made the PR for this already. My bad! I'll find my implementation and get a PR set up soon.

Tagl avatar May 15 '24 22:05 Tagl