OpusTrainer
Add a cleaning rule for URL names, such as Amazon.com -> Amazon.it
Similar to #736, we should discard sentences that translate the domain suffix of a website, like Amazon.com -> Amazon.it
With a regex such as /[a-z]+\.com\b/ we could identify a URL on the English side, and check that the same URL appears unchanged on the other language's side.
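A minimal sketch of how such a check might look, in Python. The function name `keep_pair` and the constant `DOMAIN_RE` are hypothetical and not part of any existing cleaning tool; the leading `\b` and the case-insensitive flag are additions to the regex above, assumed here so that the whole of `Amazon.com` is captured.

```python
import re

# Hypothetical pattern derived from the regex proposed above.
# Assumptions: a leading \b and IGNORECASE are added so "Amazon.com" matches in full.
DOMAIN_RE = re.compile(r"\b[a-z]+\.com\b", re.IGNORECASE)


def keep_pair(src: str, trg: str) -> bool:
    """Return True if every .com domain found in src also appears in trg.

    A pair such as "Amazon.com" -> "Amazon.it" returns False, meaning the
    sentence pair would be discarded.
    """
    trg_lower = trg.lower()
    return all(
        match.group(0).lower() in trg_lower
        for match in DOMAIN_RE.finditer(src)
    )
```

Pairs with no `.com` domain on the English side pass through unchanged, so the rule only affects sentences that actually contain one.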
Sorry for the noise, once again I posted this in the wrong repo, lol.