ISAcreator
Does not ensure that ISA-Tabs are in utf-8
(as discussed face-to-face on 2017-02-05)
What originated as a bug in our data repository, https://github.com/hidelab/genometranslationcommons/issues/14, turns out to have been caused by the ISA-Tab created by ISAcreator not being encoded in utf-8. That, in turn, happened because Java on Windows doesn't guarantee utf-8 as the default encoding.
Solutions that would improve the matter:
1. The Java program could override the system locale and just write everything in utf-8.
2. Warn when saving if the encoding isn't utf-8.
3. Make the encoding selectable from a menu.
I strongly prefer option 1 (are there users who want ISA-Tab files that aren't utf-8?).
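To make the first option concrete, here is a minimal sketch of writing a file with an explicit charset instead of relying on the platform default. The class `Utf8WriteSketch` and the helper `writeUtf8` are hypothetical names for illustration only; I have not looked at how ISAcreator actually writes its files, so treat this as an assumption about what such a change could look like.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8WriteSketch {

    // Hypothetical helper (not ISAcreator code): always writes UTF-8,
    // whatever the system locale or the file.encoding property says.
    static void writeUtf8(Path file, String content) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("isatab-sketch", ".txt");
        // \u2264 is the "less than or equal" sign used as the example in this thread.
        writeUtf8(tmp, "5 \u2264 5");
        byte[] bytes = Files.readAllBytes(tmp);
        // In UTF-8 the sign takes three bytes (E2 89 A4), so the file is 7 bytes long.
        System.out.println("bytes written: " + bytes.length);
        Files.delete(tmp);
    }
}
```

The point of passing `StandardCharsets.UTF_8` explicitly is that the output no longer depends on how the JVM was launched.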
It's possible that we could achieve that by setting an option when java is invoked.
My brief research leads me to https://groups.google.com/forum/#!topic/isaforum/03P91ZQ1mj0 which suggests -Dfile.encoding=utf-8.
But I don't know how Java programs are packaged or launched.
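One way to see whether a launch option like this took effect is to ask the JVM which default charset it ended up with. The sketch below is only an illustration: `DefaultCharsetCheck` and `isUtf8` are hypothetical names, and the warning text is made up. A check along these lines could also back the "warn when saving" idea.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetCheck {

    // Hypothetical helper: true only when the given charset is UTF-8.
    static boolean isUtf8(Charset charset) {
        return StandardCharsets.UTF_8.equals(charset);
    }

    public static void main(String[] args) {
        // The charset the JVM uses when code does not name one explicitly.
        Charset defaultCharset = Charset.defaultCharset();
        if (isUtf8(defaultCharset)) {
            System.out.println("Default charset is UTF-8");
        } else {
            System.err.println("Warning: default charset is " + defaultCharset
                    + ", files written without an explicit charset may not be UTF-8");
        }
    }
}
```

Running it once with and once without `-Dfile.encoding=utf-8` on the affected machine should show whether the flag changes the default there.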
On Windows you can write a batch script to start up ISAcreator using UTF-8 encoding. Here's the one I wrote:
echo. Starting ISAcreator
java -Dfile.encoding=utf-8 -Xmx1024m -Xms512m -jar ISAcreator.jar
ah thanks @DanBerrios, I don't use Windows myself but this will be useful
an example unicode character: ≤
as in 5 ≤ 5