sphinx-simplepdf

Error due to missing anchor

Open kreuzberger opened this issue 2 years ago • 6 comments

During the build with Sphinx-SimplePDF I get an error on every project: ERROR: No anchor # for internal URI reference

This seems to come from the TOC anchor pointing to an empty internal reference. I suspect this markup:

  <div aria-label="main navigation" class="sphinxsidebar" role="navigation">
   <div class="sphinxsidebarwrapper">
    <div>
     <h3>
      <a href="#">
       Table of Contents

kreuzberger avatar Oct 05 '22 08:10 kreuzberger

Yes, not having a valid anchor seems to be a problem. I sometimes get dozens of these errors for some projects.

I'm pretty sure Sphinx-SimplePDF is not the source of this problem, as the HTML gets rendered by Sphinx. So it's either Sphinx or the weasyprint library we use that has problems with such anchors.

But maybe we can do some HTML post-processing and search for such invalid anchors before weasyprint tries to create a PDF out of it.
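
A minimal sketch of what such a post-processing check could look like, using only the Python standard library. The class and function names are hypothetical, and treating `id` attributes and `<a name>` as the valid link targets is an assumption about what weasyprint accepts as an anchor. The check only reports dangling references; what to do with them (drop the `href`, point it somewhere valid) would still need to be decided.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class AnchorAudit(HTMLParser):
    """Collect possible link targets and internal (#fragment) references."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # values of id= (and name= on <a>)
        self.fragments = []    # (fragment, (line, column)) per href="#..."

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.targets.add(attrs["name"])
        href = attrs.get("href")
        if tag == "a" and href is not None and href.startswith("#"):
            # href="#" yields the empty fragment "", which never matches a target
            self.fragments.append((unquote(href[1:]), self.getpos()))


def find_dangling_anchors(html):
    """Return (fragment, position) for internal links without a matching target."""
    audit = AnchorAudit()
    audit.feed(html)
    audit.close()
    return [(frag, pos) for frag, pos in audit.fragments
            if frag not in audit.targets]
```

Run on the generated singlehtml file, this would list every `<a href="#">` (empty fragment) together with any `#name` link that has no matching element.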

danwos avatar Oct 05 '22 09:10 danwos

The warnings and errors come from weasyprint, e.g. the tons of CSS warnings and also the anchor error. I can reproduce it by calling weasyprint from the command line on the generated singlehtml input file.

The question is how to handle them. Possible solutions:

  • capture stderr and show it only if an error occurs (e.g. with check=True)
  • call weasyprint with quiet option.

I think the first solution would be ok?
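
A rough sketch of the first option, assuming weasyprint is invoked through `subprocess` as the builder does. The helper name `run_quietly` is hypothetical. One caveat: this only surfaces the output when the process exits with a non-zero status, and the thread does not confirm that weasyprint does so for the anchor error.

```python
import subprocess
import sys


def run_quietly(args, timeout=None):
    """Run a command and hide its output unless the command fails."""
    try:
        completed = subprocess.run(
            args,
            timeout=timeout,
            text=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            check=True,                # raise CalledProcessError on non-zero exit
        )
    except subprocess.CalledProcessError as exc:
        # Replay the captured output only now that something went wrong.
        print(exc.output, file=sys.stderr, end="")
        raise
    return completed.stdout
```

The captured text is returned on success, so the caller can still log it at a debug level instead of discarding it.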

kreuzberger avatar Oct 05 '22 10:10 kreuzberger

Preprocessing the HTML to satisfy weasyprint is not an option for me in this scenario.

kreuzberger avatar Oct 05 '22 10:10 kreuzberger

> capture stderr and show it only if an error occurs (e.g. with check=True)

I'm not sure if this is really the best option. So far I haven't found the time to analyze all the problems weasyprint is complaining about. Maybe most of them can be fixed by updating our theme and removing some unsupported CSS styles. I would like to try this first ...

danwos avatar Oct 05 '22 10:10 danwos

Most of the warnings seem to come from the sphinx-needs styles. These are valid CSS (e.g. text-shadow), but because weasyprint does not support them, they are reported with "WARNING: Ignored". Silencing all output does not seem helpful either, and could even be dangerous.

So what about filtering with regular expressions? I attached a possible patch. With this patch we could apply a filter list to reduce the output. (Ignore the patch direction: the -/+ markers are swapped, so the lines marked "-" are the ones the patch adds. A complete PR would follow if required.)

diff --color -ru a/builders/simplepdf.py b/builders/simplepdf.py
--- a/builders/simplepdf.py	2023-03-27 14:13:12.406283366 +0200
+++ b/builders/simplepdf.py	2023-03-21 14:08:02.612199132 +0100
@@ -1,5 +1,4 @@
 import os
-import re
 from typing import Any, Dict
 import subprocess
 import weasyprint
@@ -122,29 +121,19 @@
 
         timeout = self.config['simplepdf_weasyprint_timeout']
 
-
-        filter_list = self.config['simplepdf_weasyprint_filter']
-        filter_pattern = '(?:% s)' % '|'.join(filter_list) if 0 < len(filter_list) else None
-
         if self.config['simplepdf_use_weasyprint_api']:
-
+            
             doc = weasyprint.HTML(index_path)
 
             doc.write_pdf(
                 target=os.path.join(self.app.outdir, f'{file_name}'),
             )
-
+        
         else:
             retries = self.config['simplepdf_weasyprint_retries']
             for n in range(1 + retries):
                 try:
-                    wp_out = subprocess.check_output(args, timeout=timeout, text=True, stderr=subprocess.STDOUT)
-
-                    for line in wp_out.splitlines():
-                        if filter_pattern is not None and re.match(filter_pattern, line):
-                            pass
-                        else:
-                            print(line)
+                    subprocess.check_output(args, timeout=timeout, text=True)
                     break
                 except subprocess.TimeoutExpired:
                     logger.warning(f"TimeoutExpired in weasyprint, retrying")
@@ -177,7 +166,6 @@
     app.add_config_value("simplepdf_theme", "simplepdf_theme", "html", types=[str])
     app.add_config_value("simplepdf_theme_options", {}, "html", types=[dict])
     app.add_config_value("simplepdf_sidebars", {'**': ["localtoc.html"]}, "html", types=[dict])
-    app.add_config_value("simplepdf_weasyprint_filter", [], "html", types=[list])
     app.add_builder(SimplePdfBuilder)
 
     return {
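
The filtering step from the patch can also be sketched as a standalone helper, which makes it easier to try out patterns against saved weasyprint output. The function names are hypothetical and the pattern list is only an example. As in the patch, `re.match` is used, so each pattern has to match from the start of a line.

```python
import re


def build_filter(patterns):
    """Combine the patterns into one compiled regex, or None for an empty list."""
    if not patterns:
        return None
    return re.compile("(?:%s)" % "|".join(patterns))


def filter_output(text, patterns):
    """Return the lines of text that match none of the patterns."""
    combined = build_filter(patterns)
    return [
        line for line in text.splitlines()
        if combined is None or not combined.match(line)
    ]


# Example value for the proposed simplepdf_weasyprint_filter option in conf.py
simplepdf_weasyprint_filter = [
    r"WARNING: Ignored",
    r"WARNING: Expected a media type",
]
```

With this list, lines starting with "WARNING: Ignored" or "WARNING: Expected a media type" are dropped, while the "ERROR: No anchor" line is still printed.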

kreuzberger avatar Mar 27 '23 12:03 kreuzberger

Almost all errors, except the missing-anchor error, come from the imported datatables.min.css.

INFO: Step 2 - Fetching and parsing CSS - file:///home/kreuzberger/src/s5/sx_spdf/sx/build/release/src/module/demod/iqdemod/doc/rtr/simplepdf/_static/sphinx-needs/libs/html/datatables.min.css
WARNING: Error: Expected <ident> for declaration name, got literal. at 13:645.
WARNING: Error: Expected <ident> for declaration name, got literal. at 13:8256.
WARNING: Error: Expected <ident> for declaration name, got literal. at 13:8848.
WARNING: Error: Expected <ident> for declaration name, got literal. at 13:12508.
WARNING: Expected a media type, got screen/**/and/**/(max-width: 767px)
WARNING: Invalid media type " screen and (max-width: 767px)" the whole @media rule was ignored at 13:13732.
WARNING: Expected a media type, got screen/**/and/**/(max-width: 640px)
WARNING: Invalid media type " screen and (max-width: 640px)" the whole @media rule was ignored at 13:13937.
WARNING: Expected a media type, got screen/**/and/**/(max-width: 640px)
WARNING: Invalid media type " screen and (max-width: 640px)" the whole @media rule was ignored at 16:8502.
WARNING: Expected a media type, got screen/**/and/**/(max-width: 767px)
WARNING: Invalid media type " screen and (max-width: 767px)" the whole @media rule was ignored at 28:3845.

It seems weasyprint complains about "*cursor:hand" (I assume the wildcard selector) and similar properties.
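
When weasyprint is used through its Python API (the `simplepdf_use_weasyprint_api` branch) there is no subprocess output to filter, so a `logging.Filter` might be an alternative. This is only a sketch: the class and function names are hypothetical, and the logger name `weasyprint` is an assumption that should be checked against the installed version. The patterns are matched against the log message itself, without the "WARNING: " prefix that the formatter adds.

```python
import logging
import re


class PatternFilter(logging.Filter):
    """Drop log records whose message starts with any of the given patterns."""

    def __init__(self, patterns):
        super().__init__()
        self._regex = (
            re.compile("|".join(f"(?:{p})" for p in patterns))
            if patterns else None
        )

    def filter(self, record):
        if self._regex is None:
            return True  # nothing configured, keep every record
        return self._regex.match(record.getMessage()) is None


def install_filter(patterns, logger_name="weasyprint"):
    """Attach a PatternFilter to the named logger and return it."""
    flt = PatternFilter(patterns)
    logging.getLogger(logger_name).addFilter(flt)
    return flt
```

For the datatables.min.css output above, patterns such as `r"Error: Expected <ident> for declaration name"` and `r"Expected a media type"` would be candidates.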

kreuzberger avatar Mar 28 '23 07:03 kreuzberger