Parsr
Ghostscript library not found on MacOS bare metal installation of Parsr 1.2.2
Summary
On MacOS 11.6.4, using Parsr 1.2.2, the camelot package cannot find the Ghostscript library installed by macports. I have good reason to believe that homebrew has the same problem.
Steps To Reproduce
- Perform a bare metal install on MacOS 11.x or later.
- Run the UI application using npm run start:api.
- Surf to the appropriate URL, upload a document, accept the default settings (i.e., run all steps), press Submit.
- Consult the Web service log, and look for TableDetectionScript.py.
Expected behavior
You should find this in the log:
Table detection succeed
Actual behavior
You'll find this in the log:
    raise OSError(
OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
Environment
- Parsr 1.2.2 release
- Current MacPorts and Node.js
- 16-inch 2019 MacBook Pro
- MacOS Big Sur 11.6.4
Additional context
The error is raised in the camelot package when it looks for the Ghostscript library, named gs. It calls ctypes.util.find_library("gs"), which searches a number of locations, including those given by the environment variables DYLD_LIBRARY_PATH and DYLD_FALLBACK_LIBRARY_PATH. There are two problems.
First, on MacOS, the Parsr installation documents recommend using homebrew to install the Ghostscript package, and my colleagues tell me that homebrew does not set any of the relevant environment variables. Neither does macports. The PATH is updated, but not the library path variables. So, in the default situation, camelot won't have a chance to find these libraries.
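To see what camelot would find on a given machine, the lookup can be run by hand in the same environment that starts Parsr. The following is a minimal sketch, not part of Parsr or camelot; the helper name ghostscript_lookup_report is made up for illustration, and only ctypes.util.find_library and the two variable names come from the report above.

```python
import ctypes.util
import os

# The two variables named in the report; find_library consults them on macOS.
DYLD_VARS = ("DYLD_LIBRARY_PATH", "DYLD_FALLBACK_LIBRARY_PATH")


def ghostscript_lookup_report():
    """Collect the DYLD_* settings and the result of looking up the gs library."""
    report = {name: os.environ.get(name) for name in DYLD_VARS}
    # None means the library was not found, presumably the case behind the OSError.
    report["gs"] = ctypes.util.find_library("gs")
    return report


if __name__ == "__main__":
    for key, value in ghostscript_lookup_report().items():
        print(f"{key} = {value}")
```

If both variables are reported as unset and gs is None, the default situation described above applies.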
Second, something in the overall npm invocation blocks the propagation of these variables even when they are set. So doing this:
$ DYLD_FALLBACK_LIBRARY_PATH=/opt/local/lib npm run start:api
doesn't work either: by the time execution reaches the detectTables() function in CommandExecuter.ts, the environment variable is already undefined. What might be going on is that, on MacOS, System Integrity Protection prevents the DYLD_ environment variables from being passed through certain protected executables, one of which is sh. So if a shell is wrapped around a command line invocation, the environment variable is lost. Here's an illustration in Python:
>>> import os, subprocess
>>> os.environ["DYLD_FALLBACK_LIBRARY_PATH"] = "/opt/local/lib"
>>> import ctypes.util
>>> ctypes.util.find_library("gs")
'/opt/local/lib/libgs.dylib'
>>> subprocess.run(["python", "-c", "import ctypes.util; print(ctypes.util.find_library('gs'))"])
/opt/local/lib/libgs.dylib
>>> subprocess.run("python -c \"import ctypes.util; print(ctypes.util.find_library('gs'))\"", shell=True)
None
However, the problem is not that the subcommand invoked within detectTables() is invoking a shell and stripping the variable; the problem is farther up, since the environment variable is already stripped when detectTables() is invoked. I suspect that somewhere in the chain of commands that npm launches, a shell is invoked and the environment variable is stripped there.
If I set the environment variable directly in detectTables() before the Python command is run, and rerun the npm command, the problem goes away.
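The workaround amounts to setting the variable at the last hop, immediately before the Python command is launched, and launching it without a shell. The actual change was made in detectTables() in CommandExecuter.ts; what follows is only a Python analogue of the same idea, not Parsr's code. The helper names env_with_library_path and run_without_shell are hypothetical, and /opt/local/lib is the MacPorts directory used earlier in this report (a Homebrew install would need its own library directory passed in).

```python
import os
import subprocess

VAR = "DYLD_FALLBACK_LIBRARY_PATH"
# MacPorts library directory from the report; an assumption for any other setup.
DEFAULT_LIB_DIR = "/opt/local/lib"


def env_with_library_path(base_env, lib_dir=DEFAULT_LIB_DIR):
    """Return a copy of base_env whose DYLD_FALLBACK_LIBRARY_PATH includes lib_dir."""
    env = dict(base_env)
    existing = [p for p in env.get(VAR, "").split(":") if p]
    if lib_dir not in existing:
        existing.append(lib_dir)
    env[VAR] = ":".join(existing)
    return env


def run_without_shell(args, base_env=None, lib_dir=DEFAULT_LIB_DIR):
    """Run an argument list directly (no shell) with the adjusted environment."""
    source = os.environ if base_env is None else base_env
    env = env_with_library_path(source, lib_dir)
    # Passing a list and leaving shell=False avoids the sh hop shown to drop DYLD_ variables.
    return subprocess.run(args, env=env, capture_output=True, text=True, check=False)
```

Setting the variable this late means nothing earlier in the npm chain has a chance to strip it, which matches the observation that the problem goes away when the variable is set directly in detectTables().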