Error when using shared/setup.sh

Open paxcema opened this issue 4 years ago • 16 comments

Hi,

I am trying to add a framework to the benchmark suite. However, when I run python runbenchmark.py framework validation, my setup.sh calls shared/setup.sh, which fails at line 42 with the error .../venv/bin/pip: No such file or directory. I already tried a shorter path (54 characters, no whitespace), since it seems long paths or paths with spaces can cause errors as well.

I also tried running python runbenchmark.py TPOT validation, and got the same error, which suggests the issue is not related to my setup.sh file.

Any ideas as to what it might be?

paxcema avatar Jul 29 '20 20:07 paxcema

Hi @paxcema, do you really get 3 dots in .../venv/bin/pip: No such file or directory? Which OS are you on? From the automlbenchmark app folder, can you please run the following command and tell me the output?

cd frameworks/shared; SHARED_DIR="$(cd $(dirname "${BASH_SOURCE[0]}") && pwd -P)"; APP_ROOT=$(dirname $(dirname "$SHARED_DIR")); echo "shared=$SHARED_DIR"; echo "root=$APP_ROOT"; cd ../..

sebhrusen avatar Jul 29 '20 21:07 sebhrusen

Sorry, wrong command. This one is closer to what is effectively done:

SHARED_DIR="$(cd $(dirname "frameworks/shared/setup.sh") && pwd -P)"; APP_ROOT=$(dirname $(dirname "$SHARED_DIR")); echo "shared=$SHARED_DIR"; echo "root=$APP_ROOT"

sebhrusen avatar Jul 29 '20 21:07 sebhrusen

Hi @sebhrusen!

No, to be precise the exact error in the .sh is: /home/pcerdam/mdb/amlb/frameworks/TPOT/../shared/setup.sh: line 42: /home/pcerdam/mdb/amlb/frameworks/TPOT/venv/bin/pip: No such file or directory.

The .py error is subprocess.CalledProcessError: Command '/home/pcerdam/mdb/amlb/frameworks/TPOT/setup.sh /home/pcerdam/mdb/amlb 0.11.5' returned non-zero exit status 127.

The output to that second command is:

shared=/home/pcerdam/mdb/amlb/frameworks/shared
root=/home/pcerdam/mdb/amlb

EDIT: OS is Ubuntu 16.04
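
For readers hitting the same error: exit status 127 is what a POSIX shell reports when the command it was asked to run cannot be found, which is consistent with the missing venv/bin/pip above. Below is a minimal sketch for checking this by hand. check_venv_pip is a hypothetical helper, not part of amlb, and the directory passed to it is up to you.

```shell
# Hypothetical helper (not part of amlb): report whether a framework
# directory contains an executable venv/bin/pip.
check_venv_pip() {
    pip_exec="$1/venv/bin/pip"
    if [ -x "$pip_exec" ]; then
        echo "found: $pip_exec"
    else
        echo "missing: $pip_exec"
        return 1
    fi
}

# Invoking a path that does not exist makes the shell exit with status 127,
# the same status shown in the subprocess.CalledProcessError message.
status=0
sh -c '/nonexistent/venv/bin/pip --version' 2>/dev/null || status=$?
echo "exit status: $status"
```

Run from the amlb root, check_venv_pip frameworks/TPOT would tell you whether the pip that setup.sh expects is actually there.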

paxcema avatar Jul 29 '20 21:07 paxcema

I am trying to add a framework to the benchmark suite

Ok, I guess you're not trying to add TPOT again, so I don't see why it's calling amlb/frameworks/TPOT/setup.sh.

Let's say you're trying to add framework XXX. To test it, you run python runbenchmark.py XXX, which should internally call amlb/frameworks/XXX/setup.sh, not TPOT's.

If you try to run TPOT, however, it works, right? Running python runbenchmark.py tpot will set up its venv the first time, and you don't get the previous error, correct?

Also, just wondering, is it a private/proprietary integration or some open source framework? In the latter case, do you mind showing a branch with your integration work? It will be easier for me to help.

sebhrusen avatar Jul 29 '20 21:07 sebhrusen

I also tried running python runbenchmark.py TPOT validation, and got the same error, which suggests the issue is not related to my setup.sh file.

Sorry, I completely missed this part... I really have no idea why it's failing then, I'll try on a local docker running Ubuntu 16.04 and come back to you.

sebhrusen avatar Jul 29 '20 21:07 sebhrusen

Actually, I did try it out for TPOT as-is and the error above is from executing the command python runbenchmark.py TPOT validation. Also tried python runbenchmark.py TPOT and it fails just the same. Maybe I'm missing something fundamental?

The integration is for the open-source MindsDB AutoML framework. The (little) progress I have made towards integrating can be found here.

paxcema avatar Jul 29 '20 21:07 paxcema

Just tried with a blank docker container running Ubuntu 16.04.

  1. I had to install python3.7-dev, because the default python3 (3.5.2) is too old for the app (it fails on one variable type annotation), which makes me think that I need to update the prerequisites in the README.
  2. then had issues when testing TPOT, but far after the error you get, the call to shared.sh went absolutely fine:
  3. also tried autosklearn and got different errors...
  4. all this got solved after ensuring that python3.7 was my default python3 using:
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 10

to sum up:

apt install software-properties-common
add-apt-repository ppa:deadsnakes/ppa
apt-get -y install python3.7 python3.7-venv
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 10
python3 -m pip install -U pip

cd /path/to/amlb
pip install -r requirements.txt
python3 runbenchmark.py autosklearn

Our docker images and AWS instances all use Ubuntu 18.04, and I didn't expect so many issues on Ubuntu 16.04 (even after this, tpot fails for regression there).

Also, please rebase your branch against master, I just pushed an important fix (liac-arff upgrade) for new users working with latest code: without this upgrade it will fail when downloading new openml datasets.

Please tell me if the suggested changes make sense for you. Maybe a simpler approach for you would be to create a virtualenv with Python 3.7+ for amlb and activate it before running python3 runbenchmark.py autosklearn. Again, it may not be related: I didn't hit your issue with a fresh install on Ubuntu 16.04, only several other issues related to the Python path and version.
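
The virtualenv approach could look roughly like the following. This is only a sketch: AMLB_DIR, VENV_DIR, PYTHON_BIN and both function names are placeholders introduced here, not part of amlb, and the 3.7 threshold simply mirrors the "Python 3.7+" suggestion.

```shell
# Assumptions: AMLB_DIR, VENV_DIR and PYTHON_BIN are placeholders; adjust them.
AMLB_DIR="${AMLB_DIR:-/path/to/amlb}"
VENV_DIR="${VENV_DIR:-$AMLB_DIR/venv}"
PYTHON_BIN="${PYTHON_BIN:-python3.7}"

# Succeeds when the version string "$1" (major.minor[.patch]) is 3.7 or newer.
is_supported_python() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

# Create and activate a dedicated virtualenv for amlb, then install its
# requirements with the venv's own pip.
setup_amlb_venv() {
    version=$("$PYTHON_BIN" -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 1
    is_supported_python "$version" || { echo "Python $version is too old" >&2; return 1; }
    "$PYTHON_BIN" -m venv "$VENV_DIR" || return 1
    . "$VENV_DIR/bin/activate"
    python -m pip install -U pip
    python -m pip install -r "$AMLB_DIR/requirements.txt"
}
```

Once setup_amlb_venv has succeeded in the current shell, python runbenchmark.py autosklearn runs with the venv's interpreter, so the system python3 no longer matters.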

sebhrusen avatar Jul 29 '20 23:07 sebhrusen

Thank you for the detailed reply.

I tried doing this with a fresh macOS (10.15.6) install, but the same error happens. I think it is caused by the script not actually copying the pip executable into the venv directory. I will try creating the environment manually to bypass this step.

EDIT: This is after rebasing against master.

EDIT #2: TPOT validation now works. At this point I'm fairly confident there is an issue with the shared/setup.sh file, but I'm afraid I don't know how to fix it.
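
Creating the environment manually, as mentioned above, could look like the sketch below. It is a workaround only and does not reproduce what shared/setup.sh does internally. create_framework_venv is a hypothetical helper and PYTHON_BIN is an assumed variable naming the interpreter to use. On Debian or Ubuntu, python3 -m venv also needs the matching venv package (python3.7-venv in the summary earlier in this thread).

```shell
# Hypothetical helper (not part of amlb): create a framework's venv by hand
# and confirm that pip ended up inside it.
# PYTHON_BIN is an assumption; point it at the interpreter you want to use.
create_framework_venv() {
    framework_dir="$1"
    "${PYTHON_BIN:-python3}" -m venv "$framework_dir/venv" || return 1
    if [ ! -x "$framework_dir/venv/bin/pip" ]; then
        echo "venv created without pip: $framework_dir/venv" >&2
        return 1
    fi
    echo "venv ready: $framework_dir/venv"
}
```

For example, create_framework_venv frameworks/TPOT run from the amlb root either confirms that venv/bin/pip exists or reports that the venv was created without it.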

paxcema avatar Aug 03 '20 19:08 paxcema

@paxcema that's really weird. I'm working on macOS myself (10.15.5), and as you can imagine I regularly reinstall frameworks locally to test new versions, and I've never hit this issue; in particular, all frameworks have pip in their venv/bin.

Maybe this is the Python version? Are you using the system Python executable? The app has been tested on macOS with official Python versions (3.6+) installed with brew or pyenv, and on Ubuntu 18.04 with the default python3 package. For Ubuntu 16.04, see my previous message; apparently it fails with some frameworks anyway, so it's not supported.

sebhrusen avatar Aug 03 '20 20:08 sebhrusen

Hi, I got the same error when I ran the same script multiple times at once, or when the setup was interrupted. Usually it suffices to delete the whole venv directory; after rerunning the setup, everything works (you may need to force the setup).

gabikadlecova avatar Aug 13 '20 19:08 gabikadlecova

Hi @gabrielasuchopar, if the setup is interrupted during the creation of the virtual env, I agree that it's better to simply delete it... usually enforcing setup is enough (using --setup force or -s force), but depending on how the setup.sh of the framework was defined, it will be a completely clean setup or a partial one.

What I could do is adding a clean option to do this consistently, but usually, you're right, deleting lib and venv subfolders if any, is enough.
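
Until such an option exists, the manual cleanup described here can be sketched as follows. clean_framework is a hypothetical helper, not an amlb command, and it only touches the venv and lib subfolders of the directory you pass in.

```shell
# Hypothetical helper (not part of amlb): remove the venv and lib subfolders
# of one framework directory so the next setup starts from scratch.
clean_framework() {
    framework_dir="$1"
    if [ -z "$framework_dir" ] || [ ! -d "$framework_dir" ]; then
        echo "not a directory: $framework_dir" >&2
        return 1
    fi
    for sub in venv lib; do
        if [ -d "$framework_dir/$sub" ]; then
            # ${var:?} aborts instead of expanding to an empty path.
            rm -rf "${framework_dir:?}/$sub"
            echo "removed $framework_dir/$sub"
        fi
    done
}
```

A typical sequence would be clean_framework frameworks/TPOT followed by python runbenchmark.py TPOT -s force.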

Btw, what do you mean exactly by

when I ran the same script multiple times at once

Do you mean on multiple terminals on the same machine? With more than one also doing the setup? If so, I recommend running the setup once first in a single terminal:

python runbenchmark.py my_framework -s only

and then run the benchmark scripts in multiple terminals. Am I understanding correctly?
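
The same idea can be scripted in one shell instead of several terminals: do the setup once, then start the runs in the background. This is a sketch under assumptions: it must be run from the amlb root, and AMLB_CMD and setup_then_run_parallel are illustrative names, not part of amlb.

```shell
# Assumption: run from the amlb root. AMLB_CMD is an illustrative variable.
AMLB_CMD="${AMLB_CMD:-python runbenchmark.py}"

# Set up the framework once, then launch one run per benchmark in parallel.
# Usage: setup_then_run_parallel FRAMEWORK BENCHMARK [BENCHMARK...]
setup_then_run_parallel() {
    framework="$1"
    shift
    # Sequential step: a single process creates the framework's venv.
    $AMLB_CMD "$framework" -s only || return 1
    # Parallel step: the runs start only after the setup has finished.
    for benchmark in "$@"; do
        $AMLB_CMD "$framework" "$benchmark" &
    done
    wait
}
```

For example, setup_then_run_parallel TPOT validation test would set TPOT up once and then run the two benchmarks side by side (the benchmark name test is just a placeholder here).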

sebhrusen avatar Aug 17 '20 17:08 sebhrusen

Hi @sebhrusen , thanks for your reply,

it will be a completely clean setup or a partial one

exactly, usually enforcing setup was enough, but for some of the frameworks the error persisted. The clean option would be nice, and also extensible, some future frameworks might e.g. preload some weights which could be deleted this way.

Do you mean on multiple terminals on the same machine?

Yes, it was the case of multiple terminals on the same machine (CentOS 7); I now use the '-s only' option, it took me some time to figure out what the problem was though.

gabikadlecova avatar Aug 17 '20 20:08 gabikadlecova

@gabrielasuchopar thanks a lot for your feedback, I'll add an entry in the troubleshooting guide (and also make it more visible/accessible from the README) for now, and create a ticket for the clean option.

some future frameworks might e.g. preload some weights which could be deleted this way

do you have any existing framework in mind? :)

sebhrusen avatar Aug 17 '20 20:08 sebhrusen

Thanks @gabrielasuchopar for your tips.

As a sidenote in this discussion, I will soon PR the initial MindsDB framework integration. Maybe then we can figure out what (if any) is the exact issue, because I tried using docker as @sebhrusen indicated to no avail. The weird thing is that once I bypassed the issue by manually creating the venv, I wrote the integration code and now executing python runbenchmark.py MindsDB -m docker seems to work.

paxcema avatar Aug 17 '20 21:08 paxcema

@sebhrusen

do you have any existing framework in mind? :)

I suppose TPOT has recently added support for PyTorch models, that's why it came to my mind. :) I think it only trains them from scratch for now, though.

gabikadlecova avatar Aug 20 '20 00:08 gabikadlecova

Seems like this issue has gone stale and is hopefully irrelevant with the newer releases, please let us know if you still experience the issue. I suggest we close this issue and open an issue for the suggested clean option.

What I could do is adding a clean option to do this consistently, but usually, you're right, deleting lib and venv subfolders if any, is enough.

I think that's good, though certain frameworks (e.g. R- and Java-based) might install software outside of their venv and lib folders. I don't know if it's possible to install those tools to a framework subdirectory (e.g. lib) and uninstall them safely if other local installations of the tool also exist (though that should be the responsibility of those tools).

PGijsbers avatar Sep 17 '21 10:09 PGijsbers