enca
Support cross-compilation
To support cross-compilation, enca’s build system must not depend on executables built for the host system. This includes the make_hash tool and iconvcap. Since iconvcap depends on the host iconv, it isn’t just a matter of convincing the build system to do the right thing (though even that would be slightly non-trivial with automake); ideally, these tools should not exist in the first place.
I just had a quick look at things and it seems to me like the following happens:
- configure builds and runs iconvcap to generate iconvenc.h, which then contains a number of #defines which don’t seem to be used directly anywhere (can’t tell for sure, just grepping things)
- a make rule runs sed on that file, to generate a sed script
- that sed script is run on encodings.dat in the source tree
- its output is passed through make_hash to generate encodings.h
Excuse my rudeness. Maybe there is a reason to do it this way. But as an outsider, I sure had a WTF moment just now. If possible at all, I think this should be done at runtime.
Actually I have no clue why it's implemented this way, but patches are welcome to change it.
I am trying to compile the library for arm-linux today and encountered the same issue. Looking forward to a fix. Thanks.
The real downer here is iconvcap: if it's compiled using the host toolchain, there's a good chance it won't run due to a different architecture; if it's compiled using the build toolchain, the resulting libenca binary is polluted with the build system's info.
Why does enca even need to know those iconv names?
It's there because different iconv implementations use different names for some encodings and support different sets of encodings. Even on Linux there can be either libiconv or the iconv embedded in glibc, though I'm not sure how different these two are.
On Linux, there is for example also the musl C library (http://musl-libc.org), which has its own iconv implementation. (It seems to violate one assumption of iconvcap: it can only convert from legacy encodings, but not to them.)
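To make the direction issue concrete, here is a minimal sketch of a runtime probe that checks each direction separately. The helper names `can_convert` and `report_directions` are hypothetical and not part of enca. It also assumes that a successful `iconv_open()` is a good enough proxy for "this conversion is supported", which may not hold on every implementation.

```c
#include <iconv.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if iconv_open() accepts converting
 * from `from` to `to`, and 0 otherwise. */
static int can_convert(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}

/* Hypothetical helper: print which directions the local iconv accepts
 * between `name` and UTF-8. */
static void report_directions(const char *name)
{
    printf("%s: decode=%d encode=%d\n", name,
           can_convert("UTF-8", name),   /* legacy -> UTF-8 */
           can_convert(name, "UTF-8"));  /* UTF-8 -> legacy */
}
```

Calling `report_directions("ISO-8859-2")` on a given system would show whether its iconv accepts one direction, both, or neither for that name.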
@nijel, AFAIU iconvcap just tries already known, hardcoded names, so why can't enca do this too? Why is it so important to figure out the exact names used in iconv if iconvcap manages fine without it?
It tries the possible options one by one for every encoding; doing that on every conversion sounds pretty awkward.
Sometimes being portable is indeed awkward. Also it should be possible to cache at runtime the correct name after enca figures it out for the first time and reuse it afterwards.
Caching it could be an option, though so far enca is absolutely stateless and does not store any configuration on the disk.
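A cache does not have to touch the disk. As a rough sketch, the resolved name could be remembered in memory for the lifetime of the process only. The `struct charset_entry` type and the `resolve_name` function below are hypothetical, not enca's actual data structures, and the sketch is not thread-safe.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical per-encoding record; enca's real tables look different. */
struct charset_entry {
    const char *const *aliases; /* NULL-terminated candidate names */
    const char *resolved;       /* cached working name, or NULL */
    int probed;                 /* nonzero once the probe has run */
};

/* Return the first alias that iconv_open() accepts as a source charset
 * for conversion to UTF-8, probing only on the first call. */
static const char *resolve_name(struct charset_entry *e)
{
    if (!e->probed) {
        e->probed = 1;
        for (size_t i = 0; e->aliases[i] != NULL; i++) {
            iconv_t cd = iconv_open("UTF-8", e->aliases[i]);
            if (cd != (iconv_t)-1) {
                iconv_close(cd);
                e->resolved = e->aliases[i];
                break;
            }
        }
    }
    return e->resolved; /* NULL if no alias is supported */
}
```

With this shape the probing cost is paid once per encoding per process, and nothing is written to disk.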
On the other hand, there is also the question of whether anybody actually uses enca on systems which do use non-standard encoding names in iconv...
In the end I'm not really opposed to a patch which would change do_iconv_open to iterate over possible charsets when doing iconv_open. I don't think the performance hit would be major here and it will probably make it work more as expected...
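For illustration, such an iteration could look roughly like the sketch below. The function name `open_with_aliases` and its signature are assumptions made for this example. The real do_iconv_open in enca has its own signature and would need to take its candidate names from enca's charset tables.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical sketch: try every combination of candidate names for the
 * target and source charsets until iconv_open() accepts one pair.
 * Both arrays are NULL-terminated. Returns (iconv_t)-1 if none work. */
static iconv_t open_with_aliases(const char *const *to_names,
                                 const char *const *from_names)
{
    for (size_t t = 0; to_names[t] != NULL; t++) {
        for (size_t f = 0; from_names[f] != NULL; f++) {
            iconv_t cd = iconv_open(to_names[t], from_names[f]);
            if (cd != (iconv_t)-1)
                return cd; /* caller must iconv_close() it */
        }
    }
    return (iconv_t)-1;
}
```

Each encoding has only a handful of known spellings, so the number of failed iconv_open calls per conversion stays small.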
The make_hash is compiled with host toolchain since 2393833, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=841644
Please add CPPFLAGS_FOR_BUILD (and possibly LDFLAGS_FOR_BUILD) here: https://github.com/nijel/enca/commit/2393833d133a6784e57215b89e4c4c0484555985#diff-80bc8b2b3a38cd579f0c0f9fd0b338c4R6
Per make manual: 'Use CPPFLAGS in any compilation command that runs the preprocessor, and use LDFLAGS in any compilation command that does linking as well as in any direct use of ld.'
Patches welcome :-)