
Installation stuck in Ubuntu 20 (WSL)

Open lesmo opened this issue 2 years ago • 0 comments

I've been debugging this for a couple of hours now, and I've seen other folks hit a similar issue where the installation gets stuck in many environments. I'm trying to get this to work on Ubuntu 20 running inside WSL (though at this point that detail is kinda irrelevant).

TL;DR

Just use Docker or VSCode DevContainers.

For Linux

Install dalai and ignore its postinstall script:

$ npm install -g dalai --no-package-lock --ignore-scripts

You'll need to make sure to manually install OS dependencies that the postinstall script tries to install:

$ apt-get install build-essential python3-venv -y
$ python3 -m venv ~/venv
$ source ~/venv/bin/activate
$ pip install torch torchvision torchaudio sentencepiece numpy

Note: I think it's bad practice to use such a generic venv, but... whatever.

Preamble

Before we begin, I encourage everyone to try this command, which gives more context than just "it doesn't work" or "it's stuck":

$ npm i -g dalai --loglevel=silly

Sadly, this doesn't help much. My install gets stuck with these last lines:

npm info run [email protected] install { code: 0, signal: null }
npm info run [email protected] postinstall node_modules/dalai node setup
npm info run [email protected] postinstall node_modules/dalai/node_modules/node-pty node scripts/post-install.js
npm info run [email protected] postinstall { code: 0, signal: null }
(##################) ⠇ reify:dalai: info run [email protected] postinstall { code: 0, signal: null }        

Debugging

So, I decided to debug this.

After cloning this repo, I ran this inside its folder:

$ npm run postinstall

To my surprise, this is the output I'm getting:

> [email protected] postinstall
> node setup

mkdir /home/lesmo/dalai
exec: apt-get install build-essential python3-venv -y in undefined
apt-get install build-essential python3-venv -y
exit
Enter passphrase for /home/lesmo/.ssh/id_rsa:

Thoughts

Big note: I use SSH and SSH Agent daily to interact with Git and GitHub. Nothing here should be triggering a need for SSH (plus my keys are already with the agent, so I shouldn't be prompted at all)... so I'm pretty confused.

Doing some further digging, I found that setup.js is indeed, for some reason, running ssh-add:

├─node,30411 ./setup.js
│   ├─bash,30429
│   │   └─ssh-add,30619

Funnily enough, because setup.js is not prepared to take input, the code gets stuck there forever. I'm not sure why this output doesn't make it into the npm install --loglevel=silly logs, though... which is why a lot of people are puzzled about why it's stuck, with no further info.
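To illustrate the hang in isolation, here is a minimal sketch, not dalai's actual code. The `prompter` script is a made-up stand-in for a tool like ssh-add that prints a prompt and then waits on stdin, and `runPrompter` is a hypothetical helper. With the child's stdin set to 'ignore', the child sees end-of-input right away and finishes instead of waiting forever for a passphrase nobody can type.

```javascript
const { spawnSync } = require('child_process');

// Made-up stand-in for a prompting tool (e.g. ssh-add asking for a passphrase).
// It prints a prompt, then waits until stdin is closed.
const prompter = `
  process.stdout.write('Enter passphrase: ');
  let got = '';
  process.stdin.on('data', (chunk) => { got += chunk; });
  process.stdin.on('end', () => {
    process.stdout.write(got ? 'received input' : 'no input available');
  });
`;

// Hypothetical helper: run the prompter with a chosen stdin setting and
// return whatever it wrote to stdout.
function runPrompter(stdinMode) {
  const result = spawnSync(process.execPath, ['-e', prompter], {
    stdio: [stdinMode, 'pipe', 'pipe'],
    encoding: 'utf8',
    timeout: 10000, // safety net so the demo itself can never hang
  });
  return result.stdout;
}

// stdin is closed for the child, so the prompt cannot block.
const output = runPrompter('ignore');
console.log(output);
```

The other direction is to hand the prompt to the user by giving the child the parent's terminal (`stdio: 'inherit'`), which only makes sense when the install runs interactively.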

Having SSH keys set up is pretty common, but not so much as to have a lot of people complaining (as we can see from the issues in this repo). I'll try to figure out a solution, but I'm sharing my findings in case someone has more ideas too.

I'm pretty sure this happens to many other Linux users too. My conclusion is that the spawned bash process contains no env vars at all, so it starts from scratch and provokes another SSH Agent to spawn... which opens a strange can of worms I'm not equipped (yet) to understand.
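A small sketch of the mechanism I suspect, under stated assumptions: `readVarInChild` is a hypothetical helper and `DEMO_AUTH_SOCK` is a made-up stand-in for SSH_AUTH_SOCK. It only shows that a child started with an empty environment cannot see a variable the parent has, while a child given `process.env` can. It does not prove this is what setup.js does.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: start a child Node process with the given environment
// and return what that child sees for one variable.
function readVarInChild(name, env) {
  const script =
    `process.stdout.write(String(process.env[${JSON.stringify(name)}]))`;
  const result = spawnSync(process.execPath, ['-e', script], {
    env,
    encoding: 'utf8',
  });
  return result.stdout;
}

// Made-up variable standing in for SSH_AUTH_SOCK.
process.env.DEMO_AUTH_SOCK = '/tmp/demo-agent.sock';

// Child receives the parent's environment.
const inherited = readVarInChild('DEMO_AUTH_SOCK', process.env);
// Child starts from scratch with no variables at all.
const stripped = readVarInChild('DEMO_AUTH_SOCK', {});

console.log('inherited env:', inherited);
console.log('empty env:', stripped);
```

If the agent socket variable is missing in the child like this, tools in that shell can't find the running agent, which would fit the ssh-add prompt above.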

So I'm thinking the best solution is to just copy what VSCode already does:

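const { spawn } = require('child_process');

// Example command only; setup.js would run its own install commands here.
const command = 'ls';
const args = ['-al'];

// With no `env` option, spawn() passes the parent's process.env to the child.
const child = spawn(command, args);

// Forward the child's output so nothing is silently swallowed.
child.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

child.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

// Report the exit code once the child's stdio streams have closed.
child.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});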

This way, all child processes run using the parent's context and everyone's happy.

lesmo avatar Mar 22 '23 06:03 lesmo