
ICU-22848 Add test for rbbi rule builder failure.

Open FrankYFTang opened this issue 1 year ago • 4 comments

Checklist
  • [X] Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22848
  • [X] Required: The PR title must be prefixed with a JIRA Issue number.
  • [X] Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • [X] Required: Each commit message must be prefixed with a JIRA Issue number.
  • [ ] Issue accepted (done by Technical Committee after discussion)
  • [X] Tests included, if applicable
  • [ ] API docs and/or User Guide docs changed or added, if applicable

FrankYFTang avatar Aug 07 '24 04:08 FrankYFTang

@aheninger could you take a look at the test case I added? It currently hangs, or at least it appears to be extremely slow.

FrankYFTang avatar Aug 07 '24 04:08 FrankYFTang

Looks like the state table builder's running time is exponential in the number of trailing dots in the test pattern, ".*X..................;"
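
A possible way to see why a pattern of this shape is expensive is a toy subset construction, sketched below. This is not ICU's RBBI builder and does not use any ICU API; the NFA layout and the function name `count_dfa_states` are assumptions made purely for illustration. The sketch models ".*X" followed by n dots as an NFA with a looping start state plus a chain of n + 1 states, then counts the DFA states reachable by subset construction. Under that model the DFA has to track which of the last n + 1 characters were X, so the count doubles with every added dot.

```python
def count_dfa_states(trailing_dots: int) -> int:
    """Count DFA states reachable by subset construction for the
    toy pattern '.*X' followed by `trailing_dots` dots.

    Hypothetical NFA (an assumption, not ICU's internal representation):
      state 0        loops on any character ('.*')
      state 0 -> 1   on 'X'
      state i -> i+1 on any character, for 1 <= i < last
    """
    last = trailing_dots + 1  # accepting NFA state

    def step(states: frozenset, is_x: bool) -> frozenset:
        nxt = {0}  # state 0 is always live because of '.*'
        if is_x:
            nxt.add(1)  # 'X' starts a new match attempt
        # every trailing '.' consumes one character of any kind
        nxt.update(s + 1 for s in states if 1 <= s < last)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    work = [start]
    while work:
        cur = work.pop()
        # only two input classes matter here: 'X' and "anything else"
        for is_x in (False, True):
            nxt = step(cur, is_x)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(5):
        print(n, count_dfa_states(n))
```

Run as a script, this prints 2, 4, 8, 16, 32 for n = 0 through 4, that is 2^(n+1) states in this toy model. Whether ICU's builder hits exactly this blowup or an additional cost on top of it is not established in this thread.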

This is deep in the guts of the dragon book state table construction algorithm. I doubt there is an easy fix - the algorithm is tricky and hard to fully understand.

I recommend just leaving this as a known issue. It's not a problem for real-world break rules.

aheninger avatar Aug 07 '24 19:08 aheninger

Notice: the branch changed across the force-push!

  • icu4c/source/test/intltest/rbbitst.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot
