
Switch on UTF8 encoding

Open dbeyer opened this issue 6 years ago • 17 comments

Since acmart is used internationally, by many people with accents in their author names and affiliations, it might be a good idea to switch on utf8 input encoding by default. That would spare authors one additional config line in each of their main.tex files. Is it possible to include the following in the acmart class file? \RequirePackage[utf8]{inputenc}

dbeyer avatar Sep 20 '17 08:09 dbeyer

Or even

  \RequirePackage[utf8]{inputenx}
  \input{ix-utf8enc.dfu}

krono avatar Sep 20 '17 08:09 krono

Adding this line will make life marginally better for the authors who use UTF-8 input. However, it will make life much harder for the authors who do not: for example, those who just add an epigraph in Greek and use LGR.

borisveytsman avatar Sep 21 '17 02:09 borisveytsman

Isn't LGR a font encoding, not an input encoding?

(I have a class where precisely because of that issue I use this:

\RequirePackage[LGR,OT1,LY1,T1]{fontenc}
\RequirePackage[utf8]{inputenx}
\input{ix-utf8enc.dfu}
\RequirePackage{alphabeta}

)

krono avatar Sep 21 '17 05:09 krono

Yes, you are right.

OK, suppose you want a Russian citation and use koi8.

borisveytsman avatar Sep 21 '17 14:09 borisveytsman

Hm, is Russian support really that bad with UTF-8 in LaTeX?

krono avatar Sep 21 '17 14:09 krono

Many people use koi-8 and cp1251 by tradition.

borisveytsman avatar Sep 21 '17 14:09 borisveytsman
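
For readers unfamiliar with the legacy setup mentioned above, here is a minimal sketch of a preamble for a source file saved in KOI8-R. The article class, the T2A font encoding, and the babel options are assumptions chosen for illustration and are not taken from the thread. The encoding named in the inputenc option has to match the bytes actually stored in the file, which is why a class that hard-codes utf8 would declare a different encoding than such a file uses.

```latex
% Illustrative sketch only: assumes the .tex file itself is saved as KOI8-R.
\documentclass{article}
\usepackage[T2A]{fontenc}        % Cyrillic font encoding (assumed for this example)
\usepackage[koi8-r]{inputenc}    % legacy input encoding; cp1251 is selected the same way
\usepackage[russian,english]{babel}
\begin{document}
English text, with Russian passages placed inside
\verb|\foreignlanguage{russian}{...}| and stored as KOI8-R bytes.
\end{document}
```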

\documentclass[sigconf,russian,english]{acmart}

\RequirePackage[LGR,OT1,LY1,T2A,T1]{fontenc}
\RequirePackage[utf8]{inputenx}
\input{ix-utf8enc.dfu}
\RequirePackage{alphabeta}
\usepackage{babel}
\begin{document}

\foreignlanguage{russian}{я не знаю}
\end{document}

This seems to work for short passages.

krono avatar Sep 21 '17 14:09 krono

Also, shouldn't we advocate using only UTF-8? I personally think it is worthwhile.

krono avatar Sep 21 '17 14:09 krono

We have thousands of authors. I am not sure we are in a position to forcefully advocate input encodings: do we want a long holy war among some passionate ones? I would rather give as much freedom to authors as I can.

borisveytsman avatar Sep 21 '17 14:09 borisveytsman

That's true. But with this reasoning, acmart also would have to consider XeTeX and LuaTeX, or even Biblatex instead of BibTeX, right?

krono avatar Sep 21 '17 15:09 krono

Absolutely. This is in my plans.

borisveytsman avatar Sep 21 '17 15:09 borisveytsman

Ping me if I can help :)

krono avatar Sep 21 '17 15:09 krono

Thanks!

borisveytsman avatar Sep 21 '17 15:09 borisveytsman

Please note that neither inputenc nor inputenx sets the space factor of Unicode close curly quotes correctly. (\sfcode’ is 0, but \sfcode” is 1000. Both should be zero.) Therefore, if you do take the Unicode input plunge, please also add

% inputenc doesn't set the space factor of Unicode close curly quote
% correctly.  (In current versions, the \sfcode of ’ is 0, but the
% \sfcode of ” is still 1000.  Both should be zero.)
\AtBeginDocument{%
  \sfcode\csname\encodingdefault\string\textquotedblright\endcsname=0%
  \sfcode\csname\encodingdefault\string\textquoteright\endcsname=0%
}

to an appropriate place in the class file.

zackw avatar Aug 10 '18 22:08 zackw

Shouldn't we press the upstream authors (L3 team for inputenx) to make changes rather than work around bugs? Could you write to [email protected]?

borisveytsman avatar Aug 12 '18 20:08 borisveytsman

I wrote them a note and cc:ed you.

zackw avatar Aug 14 '18 15:08 zackw

Is this question still relevant, given that UTF-8 has been the default input encoding in LaTeX since 2018 (see https://tug.org/TUGboat/tb39-1/tb121ltnews28.pdf)?

rionda avatar Oct 25 '21 14:10 rionda
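
To illustrate the point about the 2018 default, here is a minimal sketch of a document that uses accented characters without loading inputenc at all. It assumes a LaTeX release from April 2018 or later and a source file saved as UTF-8. The article class is used only to keep the example independent of acmart's metadata requirements, and the name and affiliation are made up.

```latex
% Sketch: assumes a LaTeX kernel from 2018-04-01 or later and a UTF-8 source file.
\documentclass{article}
% T1 is a font encoding (glyphs and hyphenation), not an input encoding.
\usepackage[T1]{fontenc}
\begin{document}
% Hypothetical author and affiliation, typed directly with accents.
Zoë Müller, Université de Genève
\end{document}
```

On older kernels the same file would still need \usepackage[utf8]{inputenc} in the preamble, which is the line the original request wanted the class to supply.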