juriscraper
`ET_AL` regexp is too permissive
Given a hypothetical case `Fox v. Mohammed Mat Et Aliasing`, the `ET_AL` regexp:
https://github.com/freelawproject/juriscraper/blob/c007905e622d752273460347ddc5539883a13770/juriscraper/lib/string_utils.py#L265
is too permissive, and thus the `harmonize()` function turns it into `Fox v. Mohammed Matiasing`.
While there is some justification for capturing `et alia`, there is none for other strings that merely happen to match `\set al`.
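A minimal sketch of the problem and one possible tightening. The `PERMISSIVE` pattern below is only an assumed stand-in built from the `\set al` fragment quoted above; the real `ET_AL` definition is at the linked line and may differ. `STRICT` and `strip_et_al()` are hypothetical names for illustration, not part of juriscraper.

```python
import re

# Assumed stand-in for the current behavior: anything starting with
# " et al" is stripped, even when "al" is the start of a longer word.
PERMISSIVE = re.compile(r",?\set al\.?", re.I)

# Hypothetical stricter variant: allow "et al", "et al." and "et alia",
# but require that no word character follows the match.
STRICT = re.compile(r",?\set\.?\sal(?:ia|\.)?(?!\w)", re.I)


def strip_et_al(pattern, case_name):
    """Remove whatever `pattern` matches from `case_name`."""
    return pattern.sub("", case_name)


case = "Fox v. Mohammed Mat Et Aliasing"

# The permissive pattern eats " Et Al" out of the middle of the name.
print(strip_et_al(PERMISSIVE, case))  # Fox v. Mohammed Matiasing

# The stricter pattern leaves the name alone...
print(strip_et_al(STRICT, case))  # Fox v. Mohammed Mat Et Aliasing

# ...while still removing genuine "et al." / "et alia" suffixes.
print(strip_et_al(STRICT, "Fox v. Smith, et al."))  # Fox v. Smith
print(strip_et_al(STRICT, "Fox v. Smith et alia"))  # Fox v. Smith
```

The negative lookahead `(?!\w)` is the key design choice here: it rejects a match whenever a letter, digit, or underscore follows, so `Aliasing` no longer qualifies. The hypothetical case from this issue would also make a natural extra test case for `harmonize()`.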
Nice catch. This regex has been running live for a LONG time. I think it has test cases, but sounds like we need at least one more.