feat: update tmx writer, escaper, and unit test
Pull request type
- Feature enhancement -> [enhancement]
Which ticket is resolved?
- Make the TMX writer escape its output the same way as OmegaT 6.0
- https://sourceforge.net/p/omegat/feature-requests/1749/
What does this PR change?
- Update the TMX test expectations for level 2 writing
- Use double quotes for attribute values
- Normalize indentation and empty lines
- Add the TmxEscapingWriterFactory class and a unit test covering an ordinary string, "< & >", a surrogate pair, and an invalid surrogate.
- Use TmxEscapingWriterFactory in TMXWriter2 for the TEXT writer
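
The four test inputs listed above can be illustrated with a small standalone sketch. It is not the code of TmxEscapingWriterFactory: the class name `TmxTextEscapeSketch`, the helper `escape`, and in particular the choice to drop a lone surrogate are assumptions made only for this example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical illustration only; not the actual TmxEscapingWriterFactory.
class TmxTextEscapeSketch {

    // Escapes element text: '&', '<' and '>' become entities,
    // valid surrogate pairs pass through unchanged.
    static void escapeText(String text, Writer out) throws IOException {
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isHighSurrogate(c) && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1))) {
                // Valid surrogate pair: write both halves as they are.
                out.write(text, i, 2);
                i += 2;
                continue;
            }
            if (Character.isSurrogate(c)) {
                // Lone surrogate: dropped here (an assumption of this sketch).
                i++;
                continue;
            }
            switch (c) {
            case '&':
                out.write("&amp;");
                break;
            case '<':
                out.write("&lt;");
                break;
            case '>':
                out.write("&gt;");
                break;
            default:
                out.write(c);
            }
            i++;
        }
    }

    // Convenience wrapper that returns the escaped text as a String.
    static String escape(String text) {
        StringWriter sw = new StringWriter();
        try {
            escapeText(text, sw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("ordinary string"));
        System.out.println(escape("< & >"));
        System.out.println(escape("\uD83D\uDE00")); // valid surrogate pair
        System.out.println(escape("a\uD83Db"));     // lone high surrogate
    }
}
```

Note that double quotes are left untouched here because the sketch covers element text only, not attribute values.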
Other information
For reference, com.sun.xml.bind.marshaller#escape, which OmegaT 6.0 and earlier versions ultimately use for escaping, is as follows:
public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
    int limit = start + length;
    for (int i = start; i < limit; ++i) {
        char c = ch[i];
        // '&', '<', '>' and '\r' are always escaped; '\n' and '"' only in attribute values.
        if (c == '&' || c == '<' || c == '>' || c == '\r' || c == '\n' && isAttVal || c == '"' && isAttVal) {
            // Flush the unescaped run that precedes this character.
            if (i != start) {
                out.write(ch, start, i - start);
            }
            start = i + 1;
            switch (ch[i]) {
            case '\n':
            case '\r':
                // Written as a numeric character reference, e.g. &#13;
                out.write("&#");
                out.write(Integer.toString(c));
                out.write(59); // 59 is ';'
                break;
            case '"':
                out.write("&quot;");
                break;
            case '&':
                out.write("&amp;");
                break;
            case '<':
                out.write("&lt;");
                break;
            case '>':
                out.write("&gt;");
                break;
            default:
                throw new IllegalArgumentException("Cannot escape: '" + c + "'");
            }
        }
    }
    // Flush whatever remains after the last escaped character.
    if (start != limit) {
        out.write(ch, start, limit - start);
    }
}
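
To make the difference between element text and attribute values concrete, here is a condensed restatement of the same rules. It is a sketch written for this description, not library code; the class name `MinimalEscapeDemo` and the String-based signature are assumptions chosen for brevity.

```java
// Condensed restatement of the escaping rules quoted above (illustration only).
class MinimalEscapeDemo {

    static String escape(String s, boolean isAttVal) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '&') {
                sb.append("&amp;");
            } else if (c == '<') {
                sb.append("&lt;");
            } else if (c == '>') {
                sb.append("&gt;");
            } else if (c == '\r') {
                sb.append("&#13;");      // always escaped
            } else if (c == '\n' && isAttVal) {
                sb.append("&#10;");      // only in attribute values
            } else if (c == '"' && isAttVal) {
                sb.append("&quot;");     // only in attribute values
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "a<b & \"c\"\n";
        System.out.print(escape(input, false));   // text: quote and newline kept
        System.out.println(escape(input, true));  // attribute: quote and newline escaped
    }
}
```

In element text the double quote and the line feed are written unchanged, while in an attribute value they become `&quot;` and `&#10;`.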