migrate away from conda
Hi! I propose migrating away from Conda to venv. I have been using venv for garak development on my local machine without issues.
However, Conda can handle non-Python packages. Is this an important consideration for this project?
What do you think? @leondz @jmartin-tech @erickgalinkin
There are some other alternatives we can also consider: https://github.com/mamba-org/mamba https://github.com/astral-sh/uv
I think we could probably use poetry for package management and suggest venv for env management?
https://python-poetry.org/
venv it could be; see also the sensible comments in that article in The Register about the business dynamics.
BTW, I have been running garak fine even in Termux (albeit only in its prooted Debian, because pure Termux would need glibc-runner for glibc virtualization), without any Conda, and with local GGUF models:
root@localhost:~# garak --model_type ggml --model_name /storage/emulated/0/LLMs/MobileVLM-3B-Q4_K_M.gguf --probes divergence
garak LLM vulnerability scanner v0.10.0.post1 ( https://github.com/leondz/garak ) at 2024-11-04T03:56:46.907373
📜 logging to /root/.local/share/garak/garak.log
🦜 loading generator: ggml: /storage/emulated/0/LLMs/MobileVLM-3B-Q4_K_M.gguf
📜 reporting to /root/.local/share/garak/garak_runs/garak.29857c0b-0247-471c-840b-e9a11aebc2c0.report.jsonl
🕵️ queue of probes: divergence.Repeat
divergence.Repeat divergence.RepeatDiverges: FAIL ok on 115/ 180 (failure rate: 36.11%)
📜 report closed :) /root/.local/share/garak/garak_runs/garak.29857c0b-0247-471c-840b-e9a11aebc2c0.report.jsonl
on puny:
root@localhost
--------------
OS: Debian GNU/Linux 12 (bookworm) aarch64
Host: realme RMX3085
Kernel: 6.2.1-PRoot-Distro
Uptime: 2 mins
Packages: 1 (pacman), 1619 (dpkg), 1 (pkg)
Shell: bash 5.2.15
Terminal: proot
CPU: MT6785V/CD (8) @ 2.000GHz
Memory: 3477MiB / 5638MiB
- it works. So I gather most readers can decide for themselves whether to use conda, venv, or neither.
PS1. Yours is amazing software: reading some of your tests now gives me deja vu from my 2021 chats with the LLMs.
PS2. I have encountered even more vulnerabilities, including TinyLlama writing sci-fi in pseudo-Latin (sic!) after its context collapsed while trying to recreate a Wiki article about haruspices. I discovered it by chance, though, so it is hard for now even to describe it well enough to add to the probes...
@leondz
We have been using Poetry at ManimCommunity/manim for quite some time now and it's been working pretty well. It's great because those who want to use venv instead can still do an editable install of the library without messing about with poetry. Poetry also provides easy methods for publishing to PyPI as well, if that's relevant.
I don't think garak has any non-Python dependencies (so far), but in the event that it does, we can simply package garak into a container image (such as a Docker image) for easy use.
All in all, I'd strongly recommend Poetry for dependency management and environment management for developers, and venv for environment management for end-users.
In case we end up taking this route, I'd love to be assigned to this task. I think we can definitely get it done by the 24.12 milestone.
@Aathish04 thank you!
Poetry looks good and is becoming stable. The main question mark I have about this issue is validation of the migrated setup. I guess if one can:
a. manage a clean install from poetry that passes all the tests
b. identify everywhere in the docs that we mention conda and validate that poetry steps work instead
c. verify the situation with windows+ecoji, and (iirc) non-Linux+detectors.fileformats.FileIsExecutable - i.e. places where python packages aren't always enough
-- then a PR should be ready for review.
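For step (b), a small script can list every mention of conda in the docs. This is only a rough sketch: the function name `find_mentions`, the default file suffixes, and the idea of scanning from the repository root are all assumptions, not anything garak ships.

```python
from pathlib import Path


def find_mentions(root, term="conda", suffixes=(".md", ".rst", ".txt")):
    """Return (path, line_number, line) for each doc line under root mentioning term.

    Hypothetical helper: the suffix list is a guess at what counts as "docs".
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files that look like documentation.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive match so "Conda" and "conda" are both found.
            if term.lower() in line.lower():
                hits.append((path, number, line.strip()))
    return hits
```

Calling `find_mentions(".")` from a checkout would give a checklist of lines to revisit; each hit still needs a manual check that the replacement steps actually work.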
Re: venv for end-users: is there not overhead in having two environment systems to support? Am happy to let end users solve this themselves - maybe they don't even need to worry about containerisation, in the default case
@leondz Alright, I did some more digging on how complex the migration will be, and it seems that garak's current pyproject.toml uses certain features of PEP 621 that have not yet been released in Poetry.
In particular, garak makes use of the [project] section in pyproject.toml, and support for that section has only recently been merged into the main branch of Poetry:
Relevant Issue: https://github.com/python-poetry/poetry/issues/3332
Relevant Pull Request: https://github.com/python-poetry/poetry/pull/9135
The folks over at Poetry say they'll release this by the end of the year, hopefully.
I think it would be best to wait until that gets released before we move to Poetry entirely. It would be a little pointless to migrate to poetry's current spec now, and then move back to the "old" spec (which will be Poetry's "new" spec) once poetry releases it :)
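To make the distinction concrete, here is a rough sketch that reports which metadata style a pyproject.toml uses: the PEP 621 [project] table, the older [tool.poetry] table, or both. The function name `metadata_style` and its return labels are made up for illustration; `tomllib` is in the standard library from Python 3.11, and the fallback assumes the third-party `tomli` backport is installed on older versions.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API


def metadata_style(pyproject_text):
    """Return the metadata styles found in the given pyproject.toml text.

    Hypothetical helper: the labels "pep621" and "poetry" are illustrative only.
    """
    data = tomllib.loads(pyproject_text)
    styles = []
    if "project" in data:
        # PEP 621 metadata lives in the top-level [project] table.
        styles.append("pep621")
    if "poetry" in data.get("tool", {}):
        # Poetry's older, tool-specific metadata lives in [tool.poetry].
        styles.append("poetry")
    return styles
```

Something like `metadata_style(open("pyproject.toml", encoding="utf-8").read())` could be run against a checkout to see which table a project relies on.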
Garak currently uses conda only for managing the environment during development. To be honest, venv itself would suffice for this.
If you give the go-ahead, I'd be happy to update all of the docs to follow this, though it'll probably be very similar to #872
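For reference, the venv-only development setup can even be scripted with the standard library, which may be handy for CI or docs checks. This is a sketch under assumptions: the helper name `create_env` and the `.venv` default location are my own choices, not something garak prescribes.

```python
import venv
from pathlib import Path


def create_env(target=".venv", with_pip=True):
    """Create a virtual environment at target and return the path to its pyvenv.cfg.

    Hypothetical helper wrapping the standard library's venv.EnvBuilder.
    """
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(target)
    # Every environment created by venv contains a pyvenv.cfg marker file.
    return Path(target) / "pyvenv.cfg"
```

After `create_env()`, the usual next steps would be activating the environment (for example `source .venv/bin/activate` on Linux or macOS) and doing an editable install with `python -m pip install -e .` from the repository root.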
verify the situation with windows+ecoji
The windows+ecoji issue seems to be fixed in ecoji v0.1.1. Relevant Issue/PR: https://github.com/mecforlove/ecoji-py/pull/5
and non-Linux+detectors.fileformats.FileIsExecutable
I believe this is already handled properly - the relevant error is raised if the libmagic library is not found.
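As an illustration of that kind of guard, here is a minimal sketch of checking for the libmagic shared library before relying on it. It is not garak's actual check: the function names and the error message are assumptions, and `ctypes.util.find_library` only tells us whether a library named "magic" can be located on this system.

```python
import ctypes.util


def libmagic_available():
    """Return True if a shared library named 'magic' can be located."""
    return ctypes.util.find_library("magic") is not None


def require_libmagic():
    """Raise a descriptive error when libmagic is missing (hypothetical guard)."""
    if not libmagic_available():
        raise RuntimeError(
            "libmagic not found; install it with your system package manager "
            "before using detectors that inspect file formats"
        )
```

The point of failing early with a clear message is that a missing system library is something pip, Poetry, or venv cannot fix on their own.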
Regarding end-users: ideally, they should be using virtual environments to manage their Python projects anyway, but as you said, we don't have to enforce this at all.
Hi, I'm just starting out with garak and it seems to be working well with mise and uv:
git clone https://github.com/NVIDIA/garak && cd garak
mise use -g uv@latest
uv venv
uv pip install .
source .venv/bin/activate
garak --model_type huggingface --model_name gpt2 --probes dan.Dan_11_0
Thanks.
time to revisit @parkanzky @jmartin-tech