Change the type of word from str to tuple[str]
In the DFA class, the method "words_of_length" returns a generator over all accepted words of a given length.
In my use case, given a DFA, I want to be able to list all the individual labels making up a word, but labels are somewhat complex strings and not single characters. For example: "wrap(1,0)=4/ok" + "decrypt(1,4)=6/ok".
AFAIK, this is not currently supported, because the codebase treats a word as a concatenation of labels and hence as a str, instead of a tuple[str], a list[str], or a more general iterator of strings.
Should this perhaps be reworked?
@andreastedile can you give more detail on what you have in mind exactly? From the description, I'm not quite sure how the API would change based on what you have listed. Would this require the ability to process arbitrary strings the same way we process single characters in the library right now?
Sure. I recognize my post was a bit cryptic. Let me give a concrete example.
Consider the DFA example in the docs. If I do:
for word in my_dfa.words_of_length(4):
    print(word)
it will print:
0001
0101
0111
1001
1101
Now, let me apply the following modification: rename input symbol '0' to 'first' and input symbol '1' to 'second'. The same loop will then print:
firstfirstfirstsecond
firstsecondfirstsecond
firstsecondsecondsecond
secondfirstfirstsecond
secondsecondfirstsecond
The issue is that all labels composing the word are merged. That is, 'firstfirstfirstsecond' is really composed of 'first' 'first' 'first' 'second', but I can't list all these individual labels, unless I unpack the word with a regex, which is cumbersome.
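To illustrate why unpacking a merged word is awkward, here is a minimal sketch that splits a string back into labels. The helper name split_into_labels is hypothetical and not part of the library. It needs the full set of labels up front, and when one label is a prefix of another the split is not even unique:

```python
def split_into_labels(word: str, labels: set[str]) -> list[tuple[str, ...]]:
    """Return every way to split `word` into a sequence of known labels.

    Hypothetical helper for illustration only; not part of automata-lib.
    """
    if not word:
        return [()]  # exactly one way to split the empty string
    results = []
    for label in sorted(labels):  # sorted for a deterministic order
        if word.startswith(label):
            for rest in split_into_labels(word[len(label):], labels):
                results.append((label, *rest))
    return results


# Unambiguous here, because neither label is a prefix of the other.
print(split_into_labels("firstfirstfirstsecond", {"first", "second"}))

# Ambiguous: "ab" is either the two labels "a", "b" or the single label "ab".
print(split_into_labels("ab", {"a", "b", "ab"}))
```

With words represented as tuples of labels, no such reconstruction would be needed.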
The API I expect should allow something like this:
for word in my_dfa.words_of_length(4):
    for label in word:
        print(label, end=" ")
    print()
and print:
first first first second
first second first second
first second second second
second first first second
second second first second
The issue stems from the fact that the codebase treats a word as a str, instead of a tuple[str], a list[str], or a more general iterator of strings (i.e., labels).
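As a rough sketch of what a tuple-based enumeration could look like, the following uses a plain dictionary as the transition table rather than the library's DFA class. The table, the state names, and this standalone words_of_length function are assumptions made for illustration; the table was chosen so that it reproduces the five words listed above.

```python
from collections.abc import Iterator

# Hypothetical plain-dict DFA; this is NOT automata-lib's DFA class.
TRANSITIONS = {
    "q0": {"first": "q0", "second": "q1"},
    "q1": {"first": "q0", "second": "q2"},
    "q2": {"first": "q2", "second": "q1"},
}
INITIAL_STATE = "q0"
FINAL_STATES = {"q1"}


def words_of_length(n: int) -> Iterator[tuple[str, ...]]:
    """Yield every accepted word of exactly n labels as a tuple of labels."""

    def walk(state: str, prefix: tuple[str, ...]) -> Iterator[tuple[str, ...]]:
        if len(prefix) == n:
            if state in FINAL_STATES:
                yield prefix
            return
        # Visit labels in sorted order so the output order is deterministic.
        for label in sorted(TRANSITIONS[state]):
            yield from walk(TRANSITIONS[state][label], prefix + (label,))

    yield from walk(INITIAL_STATE, ())


for word in words_of_length(4):
    print(" ".join(word))
```

Because each word is a tuple, the individual labels stay separate however long they are, and single-character alphabets still work as a special case.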
@andreastedile Thanks for sharing this! @eliotwrobson what are your thoughts?
@caleb531 I think this could potentially be useful, but tricky to implement. The use of strings as a sequence of individual symbols is an assumption that's baked pretty deeply into the library. I'd be happy to review a PR adding this functionality, but I'm not sure if I could implement this myself.
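Until something like this is supported, one possible workaround (not proposed in this thread, and only a sketch) is to give each label a single-character alias, build the automaton over the aliases, and translate words back afterwards. The helper make_aliases below is hypothetical, and the choice of Unicode Private Use Area characters is an assumption made to avoid clashing with real symbols.

```python
def make_aliases(labels: set[str]) -> tuple[dict[str, str], dict[str, str]]:
    """Map each label to a distinct single character and back.

    Hypothetical helper for illustration only; aliases are taken from the
    Unicode Private Use Area starting at U+E000.
    """
    encode = {label: chr(0xE000 + i) for i, label in enumerate(sorted(labels))}
    decode = {char: label for label, char in encode.items()}
    return encode, decode


encode, decode = make_aliases({"wrap(1,0)=4/ok", "decrypt(1,4)=6/ok"})

# A merged word over the aliases, standing in for one produced by the library.
merged = encode["wrap(1,0)=4/ok"] + encode["decrypt(1,4)=6/ok"]

# Every character is exactly one label, so decoding is unambiguous.
labels = tuple(decode[char] for char in merged)
print(labels)
```

The cost is an extra translation layer around every call that takes or returns a word.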