Alignment problems for the index buffer

Open xiangsheng opened this issue 2 years ago • 20 comments

As @joostkremers suggested, I am opening a new issue for this alignment bug.

I have collected some screenshots to illustrate it.

Screenshot_20220121_162635

Screenshot_20220121_230930

Screenshot_20220121_230538

Screenshot_20220121_230418

I think the bug is not caused by LaTeX markup. Many titles with LaTeX markup align properly. What the misaligned cases above have in common is that all the titles contain a ' (quotation mark or apostrophe).

Screenshot_20220121_230657

Moreover, if the title contains Chinese characters, columns are also aligned wrongly.

xiangsheng avatar Jan 21 '22 15:01 xiangsheng

Thanks for opening a separate issue.

I think the bug is not caused by LaTeX markup. Many titles with LaTeX markup align properly. What the misaligned cases above have in common is that all the titles contain a ' (quotation mark or apostrophe).

What happens if you remove the apostrophe from the user option ebib-TeX-markup-replace-alist?

Moreover, if the title contains Chinese characters, columns are also aligned wrongly.

Has that always been the case or did that start recently?

The function that is used to calculate the column width, truncate-string-to-width, doesn't work well if there are composed characters in the text to be displayed. If that is the case with Chinese characters, then that may explain it.

Not sure if that would be easy to fix, though.
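
For reference, here is a minimal sketch of how a column can be fitted with truncate-string-to-width. The helper name my/fit-to-columns is hypothetical and is not Ebib's actual code; it only illustrates that the result is measured in display columns as Emacs counts them, which need not match what the font actually draws.

```elisp
;; Hypothetical helper, not part of Ebib: fit STRING into a fixed-width column.
(defun my/fit-to-columns (string columns)
  "Truncate or pad STRING so that it occupies COLUMNS display columns.
Widths are the ones Emacs assumes (see `char-width'), not measured pixels."
  ;; The fourth argument is the padding character used to fill the column.
  (truncate-string-to-width string columns nil ?\s))

;; Compare the character count with the column count Emacs assumes.
(let ((cell (my/fit-to-columns "现代偏微分方程引论" 10)))
  (list (length cell) (string-width cell)))
```

If the font draws a character wider or narrower than the width Emacs assumes for it, a cell built this way can still look misaligned on screen.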

joostkremers avatar Jan 22 '22 19:01 joostkremers

What happens if you remove the apostrophe from the user option ebib-TeX-markup-replace-alist?

If I set the variable in this way,

(setq ebib-TeX-markup-replace-alist '(("{" . "")
                                      ("}" . "")
                                      ;; ("``" . "“")
                                      ;; ("`" . "‘")
                                      ;; ("''" . "”")
                                      ;; ("'" . "’")
                                      ))

the alignment bug related to quotation and apostrophe seems to be gone.

Has that always been the case or did that start recently?

Yes, I suspect this bug also exists in previous versions. But I'm not very sure.

xiangsheng avatar Jan 23 '22 09:01 xiangsheng

the alignment bug related to quotation and apostrophe

Unfortunately, I'm not able to reproduce this. Titles with apostrophes are aligned fine in my setup. What font are you using exactly?

[misalignment of Chinese titles]

Yes, I suspect this bug also exists in previous versions. But I'm not very sure.

Could you paste in a bibtex entry with a Chinese title here so I can test this? I might need a more sophisticated mechanism to calculate column widths if I want to be able to deal with fonts that are more complex than plain Latin.

joostkremers avatar Feb 19 '22 09:02 joostkremers

Unfortunately, I'm not able to reproduce this. Titles with apostrophes are aligned fine in my setup. What font are you using exactly?

Yes, this bug disappears when I test with a minimal Emacs configuration. But I suspect it has nothing to do with the font I use. After some investigation, I found something odd: if a string contains non-ASCII characters, the format function does not calculate the length of the string properly in my current Emacs configuration.

(format (concat "%50s") "Boutet de Monvel's calculus and groupoids I")
(format (concat "%50s") "Boutet de Monvel’s calculus and groupoids I") 

In the example above, ' is replaced with ’. In theory, the two output strings should have the same length, and that is what happens with a minimal configuration. But in my current configuration, the second output string is shorter than the first one. It seems ’ is omitted by format.

The alignment bug in my setup has the same cause. I'm quite confused about how my configuration can affect the behavior of format, which is a C-coded function.
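
One way to narrow this down is to compare character counts with display columns before and after padding. The sketch below assumes that format pads %50s to display columns (as reported by string-width) rather than to a number of characters; that assumption is worth verifying. The name my/format-report is hypothetical.

```elisp
;; Hypothetical diagnostic: report character counts and display columns
;; for STRING, before and after padding it with `format'.
(defun my/format-report (string)
  "Return a list (CHARS COLUMNS PADDED-CHARS PADDED-COLUMNS) for STRING."
  (let ((padded (format "%50s" string)))
    (list (length string) (string-width string)
          (length padded) (string-width padded))))

(my/format-report "Boutet de Monvel's calculus and groupoids I")
(my/format-report "Boutet de Monvel’s calculus and groupoids I")
```

If the two reports differ only in PADDED-CHARS, then format is not dropping the character; it is treating it as wider than one column.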

Could you paste in a bibtex entry with a Chinese title here so I can test this?

@Book{Qi_1994aa,
	author = {Qi, Minyou and Xu, Chaojiang and Wang, Weike},
	timestamp = {2017-08-31 06:56:26 +0000},
	publisher = {武汉大学出版社},
	title = {{现代偏微分方程引论}},
	year = {1994}
}

I hope this example may be of some help.

xiangsheng avatar Feb 20 '22 14:02 xiangsheng

It seems that the alignment problem is caused by the locale setting of my system. Emacs inherits the locale setting of the system. In my case, it means that the value of current-language-environment is set to Chinese-GBK. When changing the value of the variable to English, the alignment problem disappears.

xiangsheng avatar Feb 21 '22 12:02 xiangsheng

Hmm, that's strange. I wouldn't expect the locale to make such a difference. I'd like to ask about this on the Emacs mailing list to see if this is a known fact.

joostkremers avatar Feb 21 '22 21:02 joostkremers

I wanted to post a message to help-gnu-emacs, but realised there are two things I need to know before I do:

(format (concat "%50s") "Boutet de Monvel's calculus and groupoids I")
(format (concat "%50s") "Boutet de Monvel’s calculus and groupoids I") 

In the example above, ' is replaced with ’. In theory, the two output strings should have the same length, and that is what happens with a minimal configuration. But in my current configuration, the second output string is shorter than the first one. It seems ’ is omitted by format.

I assume you have also looked at the output of truncate-string-to-width in both cases, and it's really format that's causing the problem?

Could you paste in a bibtex entry with a Chinese title here so I can test this?

@Book{Qi_1994aa,
	author = {Qi, Minyou and Xu, Chaojiang and Wang, Weike},
	timestamp = {2017-08-31 06:56:26 +0000},
	publisher = {武汉大学出版社},
	title = {{现代偏微分方程引论}},
	year = {1994}
}

Thanks. Have you tried to see if the misalignment with Chinese text also disappears when using English as your language environment? I don't see any misalignment when I add this entry to my .bib file.

Also, what Emacs version are you using?

joostkremers avatar Feb 22 '22 08:02 joostkremers

My emacs version is 27.2 and the os is Manjaro.

I assume you have also looked at the output of truncate-string-to-width in both cases, and it's really format that's causing the problem?

Yes. I have checked the output of truncate-string-to-width. They work as expected.

Have you tried to see if the misalignment with Chinese text also disappears when using English as your language environment?

Even if I set English as the language environment, I can still see the misalignment. With a minimal configuration, it looks like this. Screenshot_20220222_172616

Whether misalignment appears or not may be related to the font you use. I post the font info in the above screenshot.

xiangsheng avatar Feb 22 '22 09:02 xiangsheng

I just tried the above-pasted entry in my own .bib file and I found it misaligned.

The way I have ebib set up though, the only Chinese characters displayed are the title. When I edited the title to not use Chinese characters, the problem went away. Not sure what to take from that but there it is.

It's worth saying, though, that I have a heavy configuration, and I do all sorts of strange things to my index display (including messing around with pixel alignment for icons), so my setup is probably not a great indicator.

Hugo-Heagren avatar Feb 22 '22 11:02 Hugo-Heagren

Whether misalignment appears or not may be related to the font you use. I post the font info in the above screenshot.

Yes, that does indeed seem to be the case. On my Windows setup, I do see misalignment with the Chinese title, and the font used there is indeed different from my Linux setup.

That means that the misalignment with Chinese titles is a different issue from the misalignment with apostrophes. Misalignments due to the font used are expected if the font is proportional. There may still be a good way to deal with them, so I'll try to find some information on that topic, but it's a different question.

joostkremers avatar Feb 23 '22 11:02 joostkremers

There may still be a good way to deal with them, so I'll try to find some information on that topic, but it's a different question.

Not really a solution, but might be useful for you to know that emacs 29.1 and on have a function string-pixel-width, which does exactly what it sounds like. Earlier emacs has shr-string-pixel-width, which does the same thing, but requires the shr library.

In my config, I display symbols in the index if the entry has a file, note or link. I use an icon font for the icons, but switch these to letters if the console is a text terminal, so I need a robust function for printing things aligned properly. I use this; it might be useful inspiration:

(defun my/ebib-display-file-status (field key db)
  ;; Show a clickable icon if the entry has a file; otherwise show a
  ;; blank of the same pixel width so the column stays aligned.
  (if (ebib-get-field-value "file" key db 'noerror 'unbraced 'xref)
      (propertize my/bib-file-icon
                  'mouse-face 'highlight
                  'button t
                  'follow-link t
                  'category t
                  'keymap button-map
                  'action 'ebib-view-file)
    (let ((width (string-pixel-width my/bib-file-icon)))
      ;; This has to be only 1 character! (as does the max width of
      ;; the file index column)
      (propertize " " 'display `(space . (:width (,width)))))))

(add-to-list 'ebib-field-transformation-functions '("File" . my/ebib-display-file-status))
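
Since string-pixel-width is only available from Emacs 29.1, a small wrapper along these lines could fall back to shr-string-pixel-width on older versions. This is a sketch only; my/string-pixel-width is a hypothetical name, and the two functions may not return identical values in every setup (for example, shr's version depends on shr-use-fonts).

```elisp
(require 'subr-x)

;; Hypothetical wrapper: use `string-pixel-width' when it exists,
;; otherwise fall back to the helper from shr.el.
(defun my/string-pixel-width (string)
  "Return the pixel width of STRING."
  (if (fboundp 'string-pixel-width)
      (string-pixel-width string)
    (require 'shr)
    (shr-string-pixel-width string)))
```

The wrapper could then replace the direct call to string-pixel-width in the function above.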

Hugo-Heagren avatar Feb 23 '22 20:02 Hugo-Heagren

Not really a solution, but might be useful for you to know that emacs 29.1 and on have a function string-pixel-width, which does exactly what it sounds like. Earlier emacs has shr-string-pixel-width, which does the same thing, but requires the shr library.

Ah, that's good to know. The relevant code in shr.el isn't very long, so it would be possible to copy it over to Ebib and adapt it.

[...]

Thanks for the code example. I'm not sure when I'll be able to get to this, but it'll be a good starting point.

joostkremers avatar Feb 23 '22 20:02 joostkremers

@xiangsheng, can you tell which font you are using for the affected character (`’')? More generally, do you have any customizations of the Emacs fontsets, and if so, can you show them?

Emacs by default assumes that in CJK locales certain characters are displayed by CJK fonts, and those fonts routinely use full-width glyphs for those characters. Notable examples include Cyrillic and Greek characters, and also some punctuation characters. It sounds like at least in your case that assumption is not true, so I would like to better understand your setup with respect to the fonts you are using.
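
To see which width Emacs assumes for individual characters, something like the following can be evaluated once in the default language environment and once after switching to Chinese-GBK. It is only a sketch; my/char-widths is a hypothetical helper name.

```elisp
;; Hypothetical diagnostic: list the display width (in columns) that
;; Emacs currently assumes for each character in STRING.
(defun my/char-widths (string)
  "Return an alist of (CHARACTER . COLUMNS) for each character in STRING."
  (mapcar (lambda (c) (cons (string c) (char-width c)))
          string))

;; An ASCII apostrophe, a curved quote, Greek, Cyrillic and Chinese.
(my/char-widths "'’αя现")
```

Any character whose width changes between the two runs is a candidate for the misalignment when the font actually used for it is not a full-width CJK font.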

Eli-Zaretskii avatar Feb 26 '22 07:02 Eli-Zaretskii

@Eli-Zaretskii, thanks for reply. In the following setting,

emacs -Q
C-x RET l Chinese-GBK RET

the misalignment appears with the default font, Source Code Pro. I also tested DejaVu Sans Mono, and the problem appears there as well.

With the font Noto Sans Mono CJK SC, things are a little different. In the scratch buffer, the two lines visibly have different lengths. Screenshot_20220226_181427 But when I evaluate these two lines, the outputs in the message buffer are aligned. Screenshot_20220226_181627

xiangsheng avatar Feb 26 '22 10:02 xiangsheng

Is it reasonable to use Source Code Pro in a CJK locale? Does it have good-enough support for Chinese characters? Or are Chinese characters displayed by a different font (and if so, which font is that?)

(You can tell which font is used for a character by going to that character and typing C-u C-x =.)
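
The same check can be done from Lisp. The sketch below uses font-at and font-xlfd-name and has to be evaluated (for example with M-:) in the buffer shown in the selected window, on a graphical display. my/font-at-point is a hypothetical helper name.

```elisp
;; Hypothetical helper: report the font Emacs uses for the character at point.
(defun my/font-at-point ()
  "Return the XLFD name of the font used for the character at point.
Return nil on a text terminal or at the end of the buffer."
  (when (and (display-graphic-p) (not (eobp)))
    (let ((font (font-at (point))))
      (and font (font-xlfd-name font)))))
```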

Eli-Zaretskii avatar Feb 26 '22 10:02 Eli-Zaretskii

Is it reasonable to use Source Code Pro in a CJK locale?

This is the default font for Emacs on my OS, Manjaro, although I have no idea why this font was chosen.

Does it have good-enough support for Chinese characters? Or are Chinese characters displayed by a different font (and if so, which font is that?)

It seems Chinese characters are displayed correctly, using another font, Noto Serif CJK TC. Screenshot_20220226_220012

But at the same time, although the charset of ’ is also classified as chinese-gbk, it is displayed using Source Code Pro. Screenshot_20220226_215514

xiangsheng avatar Feb 26 '22 14:02 xiangsheng

Thanks. Please try the same, but this time set use-default-font-for-symbols to nil immediately after starting Emacs. Does this cause the character to be displayed using Noto Serif CJK TC, and if so, does that also solve the alignment problem in your original scenario?

Eli-Zaretskii avatar Feb 26 '22 14:02 Eli-Zaretskii

After emacs -Q, I tried two ways: (a) set the language environment to chinese-gbk and then set use-default-font-for-symbols to nil; (b) set use-default-font-for-symbols to nil and then set the language environment to chinese-gbk. Neither works: ’ is still displayed using Source Code Pro and the misalignment is still there.

xiangsheng avatar Feb 26 '22 14:02 xiangsheng

Thanks. One more question, hopefully the last one: do you expect ’ to be displayed using the Noto Serif CJK TC font or the default Source Code Pro font? What do you think Emacs users in your locale will expect?

Eli-Zaretskii avatar Feb 26 '22 15:02 Eli-Zaretskii

Since ’ is a special Chinese punctuation mark, the right quotation mark, in my opinion it is more appropriate to display it using Noto Serif CJK TC in a Chinese locale. But for other symbols that are not Chinese punctuation, I think it is hard to say which font is more suitable.

xiangsheng avatar Feb 27 '22 03:02 xiangsheng