unstructured module (dev) doesn't work on arch

Open liquidsec opened this issue 1 year ago • 5 comments

The unstructured module tries to install several OS dependencies that do not work on Arch Linux.

In several other cases, we've needed conditional logic to support Arch: using different package names, installing from source, etc.

The packages in question:

["libmagic-dev", "poppler-utils", "tesseract-ocr", "libreoffice", "pandoc"]

Of these, at least libmagic-dev and poppler-utils are definitely not available under those names on Arch.

liquidsec avatar Jun 17 '24 14:06 liquidsec

Relevant: https://github.com/blacklanternsecurity/bbot/issues/1467#issuecomment-2173624372

TheTechromancer avatar Jun 17 '24 14:06 TheTechromancer

@domwhewell-sage

liquidsec avatar Jun 17 '24 17:06 liquidsec

Hey, it looks like the equivalent Arch package for libmagic-dev is file, and the one for poppler-utils is poppler. Arch has its own equivalents for the rest as well. We could do something like this:

deps_ansible = [
    {
        "name": "Install Deps (Debian/Ubuntu)",
        "package": {"name": ["libmagic-dev", "poppler-utils", "tesseract-ocr", "libreoffice", "pandoc"], "state": "present"},
        "become": True,
        "when": "ansible_facts['os_family'] == 'Debian'",
    },
    {
        "name": "Install Deps (Arch)",
        "package": {"name": ["file", "poppler", "tesseract", "libreoffice", "pandoc"], "state": "present"},
        "become": True,
        "when": "ansible_facts['os_family'] == 'Archlinux'",
    },
    {
        "name": "Install Deps (Fedora)",
        "package": {"name": ["file-devel", "poppler-utils", "tesseract", "libreoffice", "pandoc"], "state": "present"},
        "become": True,
        # Ansible reports os_family "RedHat" on Fedora, so match on the distribution instead
        "when": "ansible_facts['distribution'] == 'Fedora'",
    },
]

Although I'm not sure how confident I am in this solution, as the developer does not state how to install unstructured's dependencies on archlinux, fedora-latest, gentoo, or alpine...
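
The three tasks above differ only in a few package names and the `when` condition, so they could also be generated from one mapping. This is a rough sketch rather than anything BBOT ships: `DISTROS` and `build_deps_ansible` are hypothetical names, the renames are the ones proposed in this comment, and the Fedora condition matches on `distribution` because Ansible reports `os_family` as `RedHat` there.

```python
# Hypothetical helper, not part of BBOT: builds the per-distro Ansible tasks
# from the Debian package list plus a table of per-distro renames.
DEBIAN_PACKAGES = ["libmagic-dev", "poppler-utils", "tesseract-ocr", "libreoffice", "pandoc"]

# Assumption: package equivalents as proposed in this thread, untested on real systems.
DISTROS = {
    "Debian/Ubuntu": {
        "when": "ansible_facts['os_family'] == 'Debian'",
        "renames": {},
    },
    "Arch": {
        "when": "ansible_facts['os_family'] == 'Archlinux'",
        "renames": {"libmagic-dev": "file", "poppler-utils": "poppler", "tesseract-ocr": "tesseract"},
    },
    "Fedora": {
        "when": "ansible_facts['distribution'] == 'Fedora'",
        "renames": {"libmagic-dev": "file-devel", "tesseract-ocr": "tesseract"},
    },
}


def build_deps_ansible(packages=DEBIAN_PACKAGES, distros=DISTROS):
    """Return one Ansible 'package' task per distro, applying that distro's renames."""
    tasks = []
    for label, spec in distros.items():
        # Fall back to the Debian name when a distro has no rename for a package.
        names = [spec["renames"].get(pkg, pkg) for pkg in packages]
        tasks.append(
            {
                "name": f"Install Deps ({label})",
                "package": {"name": names, "state": "present"},
                "become": True,
                "when": spec["when"],
            }
        )
    return tasks


deps_ansible = build_deps_ansible()
```

Adding another distro would then mean adding one entry to `DISTROS` instead of copying a whole task.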

domwhewell-sage avatar Jun 18 '24 15:06 domwhewell-sage

Yes, that looks good. Up to this point, any time I've needed to test something on a specific distro, I've used Docker:

docker run --rm -it archlinux

Very soon I want to set up tests for each of the main distro families, so we won't have to worry about testing this kind of thing manually.
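
Until those tests exist, the manual Docker check could be scripted along these lines. This is only a sketch under assumptions: `INSTALL_COMMANDS`, `docker_install_argv`, and `packages_install` are hypothetical names, the image names are the official Docker Hub ones, and the install commands are the usual non-interactive invocations for each package manager, not something taken from BBOT.

```python
import shlex
import subprocess

# Assumption: one non-interactive install command per official Docker image.
INSTALL_COMMANDS = {
    "archlinux": "pacman -Syu --noconfirm {pkgs}",
    "debian": "apt-get update && apt-get install -y {pkgs}",
    "fedora": "dnf install -y {pkgs}",
}


def docker_install_argv(image, packages):
    """Build the 'docker run' argument list that installs packages in a fresh container."""
    shell_cmd = INSTALL_COMMANDS[image].format(pkgs=shlex.join(packages))
    return ["docker", "run", "--rm", image, "sh", "-c", shell_cmd]


def packages_install(image, packages):
    """Return True if the install command exits with status 0 inside the container."""
    result = subprocess.run(docker_install_argv(image, packages), capture_output=True)
    return result.returncode == 0
```

Calling `packages_install("archlinux", ["file", "poppler", "tesseract", "libreoffice", "pandoc"])` needs a working Docker daemon and downloads the packages, so it is slow; `docker_install_argv` on its own only builds the command and can be inspected without running anything.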

TheTechromancer avatar Jun 18 '24 15:06 TheTechromancer

@domwhewell-sage if you end up making a PR, can you fork from this branch? That will hopefully show us which distros are passing.

TheTechromancer avatar Jun 18 '24 17:06 TheTechromancer