Installation issues: Python/xgboost/numpy/tables incompatibilities
Hi, and thanks for this tool!
I'm trying to install GenoML2 on a Linux system (Ubuntu). Despite following the README instructions, I ran into several problems caused by dependency mismatches and compatibility issues.
System information
OS Platform and Distribution: Linux Ubuntu 24.04.2 LTS
GenoML installed from: source (cloned from GitHub)
Python version: tested with 3.7, 3.9, and 3.10
Tested on the example data
Describe the current behavior: Following the README instructions results in multiple dependency conflicts and runtime errors. Installation is very difficult due to mismatched Python requirements, unavailable package versions (e.g. xgboost==2.0.3 for Python 3.7), and binary incompatibilities between numpy and tables. In addition, running genoml discrete supervised munge fails after PLINK finishes, due to a TypeError in xarray.
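To make the mismatches easier to pin down, a small script along these lines could report which versions actually ended up in the environment. This is only a sketch: the package list is taken from the names mentioned in this report, `installed_versions` is a hypothetical helper name, and `importlib.metadata` requires Python 3.8 or newer (so it would not run on the 3.7 environment).

```python
from importlib import metadata  # standard library, Python 3.8+

# Names taken from this report; adjust to match your environment.
# Some distributions may be registered under a normalized name
# (e.g. "pandas-plink" instead of "pandas_plink").
PACKAGES = ["numpy", "tables", "xarray", "pandas", "pandas_plink", "xgboost"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<14}{version or 'not installed'}")
```

Running it inside the activated conda environment and pasting the output alongside the logs should make it clearer which pinned versions conflict.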
Code to reproduce the issue:
create conda environment
conda create -n genoml_env python=3.9 -y
conda activate genoml_env
install dependencies
pip install rpy2 docopt pandas_plink numpy requests tqdm xgboost==2.0.3
clone and install genoml2
git clone https://github.com/GenoML/genoml2.git
cd genoml2
pip install .
run the tool
genoml discrete supervised munge \
  --prefix outputs/test_discrete_geno \
  --geno examples/discrete/training \
  --pheno examples/discrete/training_pheno.csv
Other Information / Logs
TypeError: Cannot interpret 'string[python]' as a data type
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
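The "numpy.dtype size changed" message is usually raised while a compiled extension module is being imported, when it was built against a different numpy than the one installed. Assuming that is what happens here, a minimal check like the one below could help isolate which import fails. The module list is taken from the packages named in this report, and `check_imports` is a hypothetical helper, not part of GenoML.

```python
import importlib

# Modules named in this report; extend the list as needed.
MODULES = ["numpy", "tables", "xarray", "pandas_plink", "xgboost"]


def check_imports(module_names):
    """Try to import each module; return None on success or an error summary."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # binary incompatibilities surface as ValueError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in check_imports(MODULES).items():
        print(f"{name:<14}{error or 'ok'}")
```

If only `tables` reports the error, that would point to the numpy/tables pairing rather than to GenoML itself.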