
MNIST example fails to find pip and install deps

Open benborder opened this issue 2 years ago • 8 comments

Describe the bug: When running the MNIST example, it fails to download the MNIST dataset, producing the following output:

/home/ben/.cache/burn-dataset/venv/bin/python3: No module named pip
Traceback (most recent call last):
  File "/home/ben/.cache/burn-dataset/importer.py", line 3, in <module>
    import pyarrow as pa
ModuleNotFoundError: No module named 'pyarrow'
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: SqliteDataset(ConnectionPool(Error(Some("unable to open database file: /home/ben/.cache/burn-dataset/mnist.db"))))', /home/ben/.cargo/registry/src/index.crates.io-6f17d22bba15001f/burn-dataset-0.9.0/src/source/huggingface/mnist.rs:86:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Seems pip can't be found and the python dependencies can't be installed.

Expected behavior: The venv Python finds pip and installs the Python dependencies.

Related: Looking around, it seems there was a PR fixing a similar issue: #496

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Python and pip are installed via system package manager apt:
    • python3 package version: 3.10.6-1~22.04
    • python3-pip package version: 22.0.2+dfsg-1ubuntu0.3
  • python3 --version outputs 3.10.12
  • pip3 --version outputs pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

benborder avatar Sep 07 '23 04:09 benborder

Could you please share the Python version you have installed? Pip has been bundled with Python since version 3.4.

Maybe we should add a version check or use the ensurepip package to make sure pip is installed. There is more on this here: https://bobbyhadz.com/blog/python-no-module-named-pip
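For illustration, a check like the one suggested above could look roughly like the following. This is a minimal sketch in Python, not what burn-dataset actually does; the helper names pip_available and module_present are hypothetical. It only asks whether a given interpreter can run pip and whether ensurepip can be located.

```python
import importlib.util
import subprocess
import sys


def pip_available(python_exe: str) -> bool:
    """Return True if `python_exe -m pip --version` exits successfully."""
    try:
        result = subprocess.run(
            [python_exe, "-m", "pip", "--version"],
            capture_output=True,
            check=False,
        )
    except OSError:
        # The interpreter itself is missing or not executable.
        return False
    return result.returncode == 0


def module_present(name: str) -> bool:
    """Return True if the current interpreter can locate the module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("pip usable:", pip_available(sys.executable))
    print("ensurepip present:", module_present("ensurepip"))
```

Pointing pip_available at the venv interpreter (for example the venv/bin/python3 path from the error output) would show whether the "No module named pip" failure comes from the venv itself rather than from the system Python.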

antimora avatar Sep 08 '23 02:09 antimora

I'm using the system Python and pip versions installed via apt on Ubuntu 22.04. python3 --version returns 3.10.12. I updated the description with the relevant version info.

I tried following the linked guide to ensure pip was installed by running python3 -m ensurepip --upgrade, but it produced the following error:

ensurepip is disabled in Debian/Ubuntu for the system python.

Python modules for the system python are usually handled by dpkg and apt-get.

    apt install python3-<module name>

Install the python3-pip package to use pip itself.  Using pip together
with the system python might have unexpected results for any system installed
module, so use it on your own risk, or make sure to only use it in virtual
environments.

The guide suggests running sudo apt install python3-venv python3-pip to install them on Ubuntu/Debian.

I think maybe there is some issue with using the system package manager version of python and pip?

benborder avatar Sep 08 '23 03:09 benborder

@benborder , thanks for the update. Hopefully, someone will have more details on this.

This resource might help whoever is gonna be working on this: https://askubuntu.com/questions/879437/ensurepip-is-disabled-in-debian-ubuntu-for-the-system-python

antimora avatar Sep 08 '23 03:09 antimora

Same error. Full error info:

Traceback (most recent call last):
  File "C:\Users\XXXX\.cache\burn-dataset\importer.py", line 3, in <module>
    import pyarrow as pa
  File "C:\Users\XXXX\.cache\burn-dataset\venv\Lib\site-packages\pyarrow\__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
ModuleNotFoundError: No module named 'pyarrow.lib'
thread '' panicked at src/lib.rs:27:18:
called `Result::unwrap()` on an `Err` value: SqliteDataset(ConnectionPool(Error(Some("unable to open database file: C:\Users\XXXX\.cache\burn-dataset\ag_news.db"))))

My solution:

1. Delete the C:\Users\lette\.cache\burn-dataset\venv directory.
2. Rerun.

PS: burn-dataset seems to use a virtual environment (venv) to avoid interfering with your existing Python environment, but it forgets to refresh it. Obviously a bug.
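The workaround above (delete the venv directory, then rerun) can be sketched as a small Python helper. This is only an illustration under assumptions: reset_venv is a hypothetical name, and the default cache location is taken from the paths shown in the error output in this thread, not from burn-dataset's source.

```python
import shutil
from pathlib import Path


def reset_venv(cache_dir: Path) -> bool:
    """Delete `cache_dir/venv` if it exists; return True if it was removed."""
    venv_dir = cache_dir / "venv"
    if not venv_dir.is_dir():
        return False
    shutil.rmtree(venv_dir)
    return True


if __name__ == "__main__":
    # Assumed default location, based on the paths in the error output above.
    default_cache = Path.home() / ".cache" / "burn-dataset"
    print("removed stale venv:", reset_venv(default_cache))
```

Only the venv subdirectory is removed, so any dataset files already in the cache directory are left alone. The next run of the example should then recreate the environment from scratch.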

letttop avatar Sep 14 '23 16:09 letttop

@Letter-R How would you refresh the virtual environment? Maybe this is a dependency bug, I don't think we are setting a version for python dependencies.

nathanielsimard avatar Sep 15 '23 13:09 nathanielsimard

I ran into the above issue (No module named pip). I've no idea what "refreshing" the venv means, but I solved it via:

sudo apt install python3.8-venv
rm -rf  ~/.cache/burn-dataset
cd ~/burn/examples/guide
cargo run --example guide

However, this gives me a rather generic error after ~5min:

zsh: segmentation fault cargo run --example guide

so I'm drumming my fingers on the Google.


Edit: okay, it's working now? After running the above commands ~3 more times, I'm seeing the TUI, whereas before I just saw a blank (cleared) terminal.


Edit2: nnnnnope segfaulted about 1% in.

Edit 3: Segfaults might be due to running in debug mode; cargo run --example mnist --release --features ndarray is working fine so far 🤞

AlexErrant avatar Nov 02 '23 17:11 AlexErrant

@AlexErrant segfaults are not normal, even in debug mode. I guess you are using the ndarray backend?

nathanielsimard avatar Nov 03 '23 12:11 nathanielsimard

Currently guide uses wgpu. Turns out (for me) it only segfaults with wgpu. tch-cpu and the various ndarrays all work. tch-gpu just exits immediately without an error message (and a nonzero exit code).

I'm on WSL so that might be the cause. wgpu-info segfaults for me as detailed here. I'm happy to consider this an upstream issue and/or off-topic.

AlexErrant avatar Nov 11 '23 21:11 AlexErrant

@laggui removed the Python dependency. PR: https://github.com/tracel-ai/burn/pull/1283

Closing it as fixed.

antimora avatar Mar 01 '24 17:03 antimora