ODT mismatch among sections, document outline, and table of contents

Open: jmclawson opened this issue 3 years ago • 1 comment

Expected behavior:

The document outlines and tables of contents in ODT files will match those of PDF files.

Actual behavior:

The document outline and (after a field update) the table of contents in ODT files are determined solely by \section commands.

Longer description:

When converting to ODT, the resulting document map seems to be determined by the \section (and \subsection, etc.) commands, without regard for the starred versions of these commands (\section*, etc.) or for commands like \addcontentsline.

Here's a MWE:

\documentclass[12pt]{article}
\usepackage{hyperref}

\begin{document}
\tableofcontents
	
\section{First Section}
\subsection{Subsection}
\section*{Second Section}
\section*{Third Section}\addcontentsline{toc}{section}{Third Section}
Bibliography\addcontentsline{toc}{section}{Bibliography}

\end{document}

The PDF output, with its hyperref-provided document outline on the left, looks like this: [screenshot: PDF output with table of contents and document outline]

Notice that the Second Section is correctly missing from both the table of contents and the document outline because it was added with \section*. The Third Section and the Bibliography, by contrast, appear in both because they were added with \addcontentsline.

When converting to ODT using the shell command make4ht -f odt mwe.tex, the resulting file looks like this in Microsoft Word: [screenshot: ODT file opened in Word, before updating the TOC]

Notice that the Second Section is incorrectly included in the document outline and Bibliography is missing from it, while the TOC matches the PDF output. When I right-click the TOC and choose Update Field, I get this: [screenshot: ODT file in Word, after Update Field]

Here, the TOC now matches the incorrect document outline. Both incorrectly include the Second Section, which was defined using \section*, and both incorrectly omit Bibliography, which was added with \addcontentsline.

P.S. It's likely I should be filing this with TeX4ht instead. I'm still learning where the division is between the two projects.

jmclawson, Mar 04 '22 15:03

Hi James,

Sorry for the late reply; I somehow missed this report and only found it now.

I am afraid that this is quite difficult to solve. The problem is that the information in the TOC generated by TeX4ht and the information in the document outline come from two separate sources.

The original table of contents in the ODT file comes from the TOC file, so it correctly omits \section* headings and includes entries added using \addcontentsline.

The document outline is created automatically by Word from the section headings used in the document. It therefore also includes \section* headings, while Bibliography is missing because it is just plain text in the document. When you choose Update Field, Word regenerates the TOC from this outline.

I am afraid that we cannot fix this.
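
This is not a fix in make4ht, but given the explanation above, a document-side mitigation might be to turn the plain-text Bibliography line of the MWE into a starred heading. The following is only a sketch: it assumes that Word picks up every heading for its outline, as described above, and it has not been verified against the actual ODT output. It also changes how Bibliography is typeset in the PDF (as a section heading rather than plain text). The Second Section would still appear in Word's outline, since it remains a heading.

```latex
\documentclass[12pt]{article}
\usepackage{hyperref}

\begin{document}
\tableofcontents

\section{First Section}
\subsection{Subsection}
\section*{Second Section}
\section*{Third Section}\addcontentsline{toc}{section}{Third Section}
% Assumption: a starred heading, unlike plain text, should be visible
% to Word's outline; \addcontentsline keeps the entry in the LaTeX TOC.
\section*{Bibliography}\addcontentsline{toc}{section}{Bibliography}

\end{document}
```

With this change, the Bibliography entry should at least survive an Update Field in Word, although the outline and the PDF bookmarks would still disagree about the Second Section.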

michal-h21, Apr 04 '22 12:04