translate-shell
Split and rejoin long lines
I need to use translate-shell to translate several thousand files, but some of them are likely to contain very long lines. I was thinking that the best way to handle this is to split the lines and rejoin them after translation.
So, I'd like to ask you if it is possible somehow to implement this in translate-shell, so other users might benefit from it!
- You can, of course, use the Unix `fmt` command to do that. This is the easiest way, but you lose the translation accuracy that comes from translating complete sentences.
- Splitting natural language is not generally easy. Different languages have different punctuation rules to indicate the end of a sentence. I'm not going to implement that right now, but hopefully translate-shell will have some support for NLP in the future.
- Just a kind reminder - if you have thousands of files to translate, the official Google Translate API is the best option. It has well-documented REST APIs and imposes fewer network limits on your client. Besides, a REST API means you can work in your language of choice, which is considerably more efficient, and awk is very, very slow :snail:
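
For anyone who wants to try the split-and-rejoin idea outside translate-shell, here is a minimal Python sketch. It is not part of translate-shell. It assumes that sentences end with `.`, `!` or `?` followed by whitespace (which, as noted above, does not hold for every language), that `MAX_CHARS` is a limit you choose yourself, and that `translate` is a hypothetical callable you supply, for example a wrapper that runs translate-shell in a subprocess.

```python
import re
import textwrap

# Assumed chunk size limit; pick whatever your translation backend accepts.
MAX_CHARS = 1000

# Naive sentence boundary: '.', '!' or '?' followed by whitespace.
# This is an assumption and is wrong for many languages.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_line(line, max_chars=MAX_CHARS):
    """Split one long line into chunks of at most max_chars characters,
    breaking at sentence ends where possible."""
    chunks = []
    current = ""
    for sentence in _SENTENCE_END.split(line.strip()):
        if len(sentence) > max_chars:
            # A single sentence is too long: fall back to word wrapping,
            # which is roughly what `fmt` would do.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(textwrap.wrap(sentence, max_chars))
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def translate_line(line, translate, max_chars=MAX_CHARS):
    """Translate a long line chunk by chunk and rejoin it as one line.
    `translate` is a hypothetical callable: str -> str."""
    return " ".join(translate(chunk) for chunk in split_line(line, max_chars))
```

Chunks are rejoined with a single space, so the original spacing between sentences is not preserved exactly.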
OK, thank you so much for your help!