Spec should specify an encoding
The spec should include the expected encoding of a proper todo file. At the moment a client cannot properly open a todo.txt file created on a system with a different encoding.
There are two ways to fix this:
- Include encoding metadata in the todo.txt file
- Specify UTF-8
My preference would definitely be number 2.
My vote is UTF-8.
This would solve the line ending problems as well.
For line endings I would allow \r, \n, and \r\n.
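To make the line-ending suggestion concrete, here is a minimal sketch in Python of a reader that accepts all three conventions. The function name `split_tasks` and the choice to drop blank lines are assumptions for illustration, not anything the spec currently says.

```python
import re
from typing import List

# Accept \r\n, \r, and \n. The two-character \r\n is listed first so that
# a Windows line ending is consumed as one break, not as two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_tasks(text: str) -> List[str]:
    """Split already-decoded todo.txt content into task lines.

    Hypothetical helper: blank lines are dropped here, which is an
    assumption rather than something the spec requires.
    """
    return [line for line in _LINE_BREAK.split(text) if line.strip()]
```

A regular expression is used instead of `str.splitlines()` because `splitlines()` also breaks on other characters (form feed, U+2028, and so on), which is more than the three endings proposed here.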
My vote is "UTF-8 with BOM".
That is because plain "UTF-8" (without a BOM) sometimes has problems with encoding detection: a file can be misidentified as ANSI, which damages characters.
I am working on a project for reference management. We store bibliographic data in BibTeX files, which are plain-text sort-of-key-value files.
We also stored the encoding information at the beginning of the file. We came to the conclusion that UTF-8 is widely adopted, and thus we support UTF-8 only. Most users should be able to use recode (or Notepad++ or the like) to change the encoding.
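For users without such a tool, the conversion itself is small. The sketch below assumes the user already knows the file's current encoding; `recode_to_utf8` is a hypothetical name, and nothing here tries to detect the encoding.

```python
def recode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode raw file content from a known legacy encoding to UTF-8.

    Hypothetical helper: source_encoding must be supplied by the caller
    (for example "latin-1" or "cp1252"); it is not detected.
    """
    # Decode with the old encoding, then encode the same text as UTF-8.
    return data.decode(source_encoding).encode("utf-8")
```

Usage would be reading the file in binary mode, passing the bytes through this function, and writing the result back in binary mode.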
IMHO BOM should not be enforced. A todo.txt file is still a text file which should be editable by any text editor and versionable by any version control system.
BOM is really only needed for broken Windows tools (such as Excel), but it would mean any todo.txt file is not a valid ASCII file anymore (even when only characters in the ASCII range are used).
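A possible middle ground, sketched here in Python as an assumption rather than a proposal for the spec text: clients tolerate a BOM when reading but do not require one. The helper names `decode_todo` and `is_ascii` are hypothetical. The second function illustrates the ASCII point above.

```python
import codecs


def decode_todo(data: bytes) -> str:
    """Decode todo.txt bytes as UTF-8, accepting an optional leading BOM."""
    # The "utf-8-sig" codec strips a leading BOM if one is present and
    # otherwise behaves like plain "utf-8".
    return data.decode("utf-8-sig")


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# The three BOM bytes (EF BB BF) are outside the ASCII range, so a file
# with a BOM is never pure ASCII, whatever the tasks contain.
with_bom = codecs.BOM_UTF8 + b"Buy milk"
without_bom = b"Buy milk"
```

With this approach both `with_bom` and `without_bom` decode to the same text, while only `without_bom` passes `is_ascii`.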