Run python script in Linux with error - internet restriction, try out these options

Open vegaviazhang opened this issue 2 years ago • 12 comments

Hi kensoh, your RPA work is very impressive. Thanks a lot for this project.

I ran into a problem running a Python script on Linux, and I really hope you can help me solve it. These are the steps I tried.

1.install package in Linux

$ pip install rpa

2.write python script

*Following other issues in this project for reference, I use a proxy in the Python script to download https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Linux.zip.

import os
import rpa as r  # the calls below use the alias r

# proxy settings must be in place before init() downloads TagUI
os.environ['http_proxy'] = "http://192.168.200.18:xxx"
os.environ['https_proxy'] = "https://192.168.200.18:xxx"

r.init(visual_automation=False, turbo_mode=True)
r.url("https://www.baidu.com")
r.close()
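The `<urlopen error ...>` lines in the logs further down suggest that the downloads go through Python's urllib, which reads `http_proxy` and `https_proxy` from the environment. Here is a minimal sketch for confirming what urllib will actually use before calling r.init(). The helper name `set_proxy` and the placeholder address are illustrative assumptions, not part of rpa, and using the same `http://` URL for both variables is an assumption about how the proxy is set up.

```python
import os
import urllib.request


def set_proxy(proxy_url):
    """Point both proxy variables at proxy_url and return what urllib sees."""
    os.environ["http_proxy"] = proxy_url
    os.environ["https_proxy"] = proxy_url
    # getproxies() reads the *_proxy environment variables
    return urllib.request.getproxies()


if __name__ == "__main__":
    # 192.0.2.1:3128 is a placeholder address, replace it with your own proxy
    print(set_proxy("http://192.0.2.1:3128"))
```

If the printed mapping does not show the expected proxy for both `http` and `https`, the variables were probably set too late or overridden elsewhere.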

3.Run the script

(vega_firenlp_super) zhangx@pve-gpu:~/a_project/p_supersimpletransformers_20211128$ python rpa_main_20220701.py
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /home/zhangx
[RPA][INFO] - done. syncing TagUI with stable cutting edge version
<urlopen error Remote end closed connection without response>
[RPA][ERROR] - failed downloading from https://raw.githubusercontent.com/tebelorg/Tump/master/TagUI-Python/tagui.sikuli/tagui.py...
[RPA][ERROR] - use init() before using url()

3.1 My attempt

  • First, the download of https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Linux.zip succeeded.
  • I thought the error might be caused by an incidental factor, such as the proxy, so I tried again.

4.I deleted the unzipped files.

rm -rf /home/zhangx/.tagui

5.Rerun the script in step 2.

(vega_firenlp_super) zhangx@pve-gpu:~/a_project/p_supersimpletransformers_20211128$ python rpa_main_20220701.py
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /home/zhangx
[RPA][INFO] - done. syncing TagUI with stable cutting edge version
<urlopen error Remote end closed connection without response>
[RPA][ERROR] - failed downloading from https://raw.githubusercontent.com/tebelorg/Tump/master/TagUI-Python/end_processes.cmd...
[RPA][ERROR] - use init() before using url()

6.This is a different error from the first run.

(vega_firenlp_super) zhangx@pve-gpu:~/a_project/p_supersimpletransformers_20211128$ python rpa_main_20220701.py
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /home/zhangx
[RPA][INFO] - done. syncing TagUI with stable cutting edge version
<urlopen error Remote end closed connection without response>
[RPA][ERROR] - failed downloading from https://raw.githubusercontent.com/tebelorg/Tump/master/TagUI-Python/end_processes.cmd...
[RPA][ERROR] - use init() before using url()

7.Third try. Repeat steps 4 and 5.

(vega_firenlp_super) zhangx@pve-gpu:~/a_project/p_supersimpletransformers_20211128$ python rpa_main_20220701.py
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /home/zhangx
[RPA][INFO] - done. syncing TagUI with stable cutting edge version
<urlopen error Remote end closed connection without response>
[RPA][ERROR] - failed downloading from https://raw.githubusercontent.com/tebelorg/Tump/master/TagUI-Python/tagui...
[RPA][ERROR] - use init() before using url()

I don't know how to deal with this. 0.0

  • I have not installed Chrome on this Linux machine, because it is a server with no graphical interface.

vegaviazhang avatar Jul 01 '22 03:07 vegaviazhang

Hi @vegaviazhang, thanks for sharing this detailed information. I think the root cause is a network policy (maybe from your company, or from China) that is blocking the automated download of required files from GitHub. One way is to run on a computer with no internet restriction and use pack() to generate the zip file to copy to your PC. Another way is to manually download the required files and set them up. See this link for detailed step-by-step instructions - https://github.com/tebelorg/RPA-Python/issues/376#issuecomment-1113873150
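Before choosing between the two options, it may help to see which of the hosts in the logs above are reachable from the restricted machine. The following is only a sketch: `probe` and its `fetch` parameter are hypothetical names introduced for illustration, and the two URLs are the ones that appear in the log output of this thread.

```python
import urllib.request


def probe(urls, fetch=urllib.request.urlopen, timeout=15):
    """Return {url: None if reachable, else the error text}."""
    results = {}
    for url in urls:
        try:
            # urlopen returns a response object usable as a context manager
            with fetch(url, timeout=timeout):
                results[url] = None
        except Exception as exc:  # URLError, timeouts, connection resets
            results[url] = str(exc)
    return results


if __name__ == "__main__":
    report = probe([
        "https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Linux.zip",
        "https://raw.githubusercontent.com/tebelorg/Tump/master/TagUI-Python/tagui.sikuli/tagui.py",
    ])
    for url, error in report.items():
        print(url, "->", error or "ok")
```

If only the raw.githubusercontent.com URL fails, that would be consistent with the logs, where the zip download completes but the sync step does not.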

kensoh avatar Jul 02 '22 22:07 kensoh

What a nice surprise, Kensoh! Today I tried the steps from this link, https://github.com/tebelorg/RPA-Python/issues/376#issuecomment-1113873150 (download the required files and set them up).

  • 1.The problem above disappeared. ^_^
  • 2.But a new problem appeared, shown below.

r.init

(vega_firenlp_super) zhangx@pve-gpu:~/a_project/z_temp_rpa$ python
Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpa as r
>>> r.init()
[RPA][ERROR] - following happens when starting TagUI...

The following command is executed to start TagUI -
"/home/zhangx/.tagui/src/tagui" rpa_python chrome

It leads to following output when starting TagUI -
/home/zhangx/.tagui/src/tagui: line 256: php: command not found
Auto configuration failed
140531240742528:error:25066067:DSO support routines:DLFCN_LOAD:could not load the shared library:dso_dlfcn.c:185:filename(libssl_conf.so): libssl_conf.so: cannot open shared object file: No such file or directory
140531240742528:error:25070067:DSO support routines:DSO_load:could not load the shared library:dso_lib.c:244:
140531240742528:error:0E07506E:configuration file routines:MODULE_LOAD_DSO:error loading dso:conf_mod.c:285:module=ssl_conf, path=ssl_conf
140531240742528:error:0E076071:configuration file routines:MODULE_RUN:unknown module name:conf_mod.c:222:module=ssl_conf

[RPA][ERROR] - unknown error encountered
False
>>>

About my machine

  • It is a Linux server.
  • Output of a shell command describing it:
(vega_firenlp_super) zhangx@pve-gpu:~$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

My confusion

  • I thought that running RPA / TagUI on Linux did not require installing PHP.
  • Is my understanding wrong?
  • Thanks very much.

vegaviazhang avatar Jul 04 '22 10:07 vegaviazhang

Ubuntu Linux does not have PHP installed by default. You can install it and try again. See this Ubuntu Colab notebook for an example of installing PHP - https://colab.research.google.com/drive/13bQO6G_hzE1teX35a3NZ4T5K-ICFFdB5?usp=sharing
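A quick way to see which external commands are still missing before calling r.init() is to look them up on PATH. This is a small sketch using only the standard library. The default command list is taken from the errors and setup steps in this thread, and `missing_commands` is a hypothetical helper name.

```python
import shutil


def missing_commands(commands=("php", "chromium-browser")):
    """Return the commands from the list that are not found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    print("missing:", ", ".join(missing) if missing else "none")
```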

kensoh avatar Jul 04 '22 10:07 kensoh

Yeah. I'll try it today and post the results later. ^.^

vegaviazhang avatar Jul 05 '22 00:07 vegaviazhang

  • I pulled a container and installed PHP according to this link. There is still a problem.

pull container

docker run -d --name vega_rpa_tagui ubuntu_python:20.04_3.8.3_sh  tail -f /dev/null

download the required files and set them up.

  • In container
  • pip install rpa==1.48.1
root@e2fe74821083:~/rpa_demo# ll /root/.tagui
total 108
drwxr-xr-x  4 1024 1024  4096 Jul  4 17:08 ./
drwx------  1 root root  4096 Jul  4 18:21 ../
-rwxr-xr-x  1 1024 1024  2100 Jul  4 17:08 .gitignore*
-rwxrwxrwx  1 1024 1024 11343 Jul  4 17:08 LICENSE.md*
-rwxrwxrwx  1 1024 1024 65833 Jul  4 17:08 README.md*
-rwxrwxrwx  1 1024 1024   572 Jul  4 17:08 package.json*
-rwxrwxrwx  1 1024 1024     0 Jul  4 17:08 rpa_python_1.48.1*
drwxrwxrwx 15 1024 1024  4096 Jul  4 18:15 src/
drwxrwxrwx  3 1024 1024  4096 Jul  4 17:08 tagui/

setup RPA environment by installing PHP, Chromium web browser and RPA for Python

  • refer to https://colab.research.google.com/drive/13bQO6G_hzE1teX35a3NZ4T5K-ICFFdB5?usp=sharing
apt-get update
apt install php
apt install chromium-browser

modify to run Chromium browser and new attempt

root@e2fe74821083:~/rpa_demo# python3
Python 3.8.10 (default, Sep 28 2021, 16:10:42)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpa as r
>>> r.dump(r.load('/root/.tagui/src/tagui').replace('"google-chrome"', '"chromium-browser"').replace('$headless_switch', '--no-sandbox'), '/root/.tagui/src/tagui')
True
>>> r.init()
[RPA][ERROR] - following happens when starting TagUI...

The following command is executed to start TagUI -
"/root/.tagui/src/tagui" rpa_python chrome

It leads to following output when starting TagUI -
/root/.tagui/src/tagui: line 304: type: chromium-browser: not found
ERROR - cannot find Chrome command "chromium-browser"
update chrome_command setting in tagui/src/tagui and make sure symlink to command is created

[RPA][ERROR] - unknown error encountered
False

So the question is: is the symlink missing?
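One way to answer the symlink question is to ask Python where the command resolves to. This is a sketch under assumptions: `resolve_command` is a hypothetical helper, and the candidate name `chromium` is a guess at an alternative binary name that the thread does not confirm.

```python
import os
import shutil


def resolve_command(name):
    """Return the real file behind a command on PATH, or None if absent."""
    found = shutil.which(name)
    if found is None:
        return None
    # realpath follows symlinks such as /usr/bin/name -> actual binary
    return os.path.realpath(found)


if __name__ == "__main__":
    # "chromium" is an extra guess; the other two names appear in this thread
    for candidate in ("chromium-browser", "chromium", "google-chrome"):
        print(candidate, "->", resolve_command(candidate))
```

If every candidate prints None, the browser package probably did not install an executable on PATH at all, which would explain the "type: chromium-browser: not found" line.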

vegaviazhang avatar Jul 05 '22 02:07 vegaviazhang

Hi Kensoh. Following the tips above, I feel that success is only one step away. The steps I used to build the RPA Ubuntu environment in a container are listed below. The remaining problem is that r.init() does not respond for a long time.

1 pull ubuntu images/install python/create container

$ docker run -d  --name vega_rpa ubuntu_python:20.04_3.8.3_sh  tail -f /dev/null

$ # enter container 
$ docker exec -it vega_rpa /bin/bash

2.install php/chromium-browser/rpa

$ apt-get update
$ apt install php -y
$ apt install chromium-browser -y
$ pip3 install rpa  -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn

All of the above commands ran successfully.

3.rpa.setup

root@be0f3a0acb74:/# python3
Python 3.8.10 (default, Sep 28 2021, 16:10:42)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>>
>>> os.environ['http_proxy'] = "http://192.168.200.18:3128" # proxy for internet access
>>> os.environ['https_proxy'] = "https://192.168.200.18:3128"
>>> import rpa as r
>>> r.setup()
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /root
[RPA][INFO] - done. syncing TagUI with stable cutting edge version
[RPA][INFO] - TagUI now ready for use in your Python environment
[RPA][INFO] - visual automation (optional) requires special setup on Linux,
[RPA][INFO] - see the link below to install OpenCV and Tesseract libraries
[RPA][INFO] - https://sikulix-2014.readthedocs.io/en/latest/newslinux.html
True
>>> r.dump(r.load('/root/.tagui/src/tagui').replace('"google-chrome"', '"chromium-browser"').replace('$headless_switch', '--no-sandbox'), '/root/.tagui/src/tagui')
True
>>> exit()
root@be0f3a0acb74:/# apt install xvfb -y

All of the above commands ran successfully.

4.install xvfb and pyvirtualdisplay

$ apt install xvfb -y
$ pip3 install pyvirtualdisplay -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn

5. Test r.init()

root@be0f3a0acb74:/# python3
Python 3.8.10 (default, Sep 28 2021, 16:10:42)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpa as r
>>> r.init()
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/tagui.py", line 584, in init
    tagui_out = _tagui_read()
  File "/usr/local/lib/python3.8/dist-packages/tagui.py", line 130, in _tagui_read
    global _process; return _py23_decode(_process.stdout.readline())
KeyboardInterrupt
>>> r.debug(True)
True
>>> r.init()

6.Refer-Version

  • Version rpa==1.48.1
root@be0f3a0acb74:/# pip3 install rpa==1.48.1
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
...
Successfully built rpa tagui
Installing collected packages: tagui, rpa
Successfully installed rpa-1.48.1 tagui-1.48.1
  • I turned on debug mode (r.debug(True)) before r.init(), but it still does not respond.
  • Hope to get your reply.
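Since the traceback shows init() blocked in a readline() on the TagUI process, a timeout wrapper can keep the interpreter from hanging indefinitely while testing. The following is a generic sketch, not part of rpa: `call_with_timeout` is a hypothetical helper, and whether r.init() behaves well when run off the main thread is an assumption the thread does not confirm.

```python
import threading


def call_with_timeout(func, seconds):
    """Run func() in a daemon thread and return (finished, value)."""
    outcome = {}

    def target():
        outcome["value"] = func()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(seconds)  # wait at most `seconds`
    if worker.is_alive():
        return False, None  # still blocked; the thread is left running
    return True, outcome.get("value")


# Usage sketch (assumes rpa is installed):
#   import rpa as r
#   finished, ok = call_with_timeout(r.init, 60)
#   if not finished: print("init() did not return within 60 seconds")
```

The blocked call is not cancelled, only abandoned, so this is useful for diagnosis rather than as a fix.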

vegaviazhang avatar Jul 05 '22 10:07 vegaviazhang

Hi @vegaviazhang thanks for sharing these details, they are very helpful!

Normally when init() is stuck, the most common reason is that it cannot connect to the Chromium process. This may be because port 9222 is not open, or is restricted so that the tool cannot make a websocket connection to the Chromium process, or there could be other problems with initialising Chromium. Can you try running the following command from your terminal and see what happens?

chromium-browser --user-data-dir="~/.tagui/src/chrome/tagui_user_profile" --remote-debugging-port=9222 about:blank --window-size=1366,768 --no-sandbox

After that, run the following command in the terminal to see if Chromium process websocket connection is opened up

curl -s localhost:9222/json

If it is opened successfully, there will be some result with about:blank as the url and there is a webSocketDebuggerUrl string in that result. After you try the above, let me know your findings so that we can find a solution together.
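The same check can be done from Python, which is convenient inside a container without curl. This sketch bypasses any configured proxy for the localhost request, because `http_proxy` is set earlier in this thread and urllib would otherwise try to send the request through it. The function names are hypothetical.

```python
import json
import urllib.request


def debugger_urls(json_text):
    """Extract webSocketDebuggerUrl values from the /json listing."""
    targets = json.loads(json_text)
    return [t["webSocketDebuggerUrl"] for t in targets
            if "webSocketDebuggerUrl" in t]


def fetch_listing(port=9222, timeout=5):
    """Fetch http://localhost:<port>/json as text, ignoring proxy settings."""
    # An empty ProxyHandler disables proxies for this opener
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = "http://localhost:%d/json" % port
    with opener.open(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    try:
        print(debugger_urls(fetch_listing()))
    except OSError as exc:
        print("nothing answering on port 9222:", exc)
```

An empty list or a connection error here corresponds to curl printing nothing.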

kensoh avatar Jul 05 '22 11:07 kensoh

Also, I notice you did not initialise the virtual display. Before you do r.init(), you have to do the following (from the Colab example). It initialises the virtual display so that the Chromium browser can run without a real display.

import pyvirtualdisplay; display = pyvirtualdisplay.Display(); display.start()

kensoh avatar Jul 05 '22 11:07 kensoh

Hi, kensoh. I tried it today. The results are as follows.

1.Reproduced the problem that init() does not respond

  • I waited for 20 minutes.
root@426a5eaa37f2:/# python3
Python 3.6.9 (default, Mar 15 2022, 13:55:28)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpa as r
>>> import pyvirtualdisplay
>>> display = pyvirtualdisplay.Display()
>>> display.start()
<pyvirtualdisplay.display.Display object at 0x7f63b79c3320>
>>> r.debug(True)
True
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2022, 7, 8, 11, 18, 37, 930608)
>>> r.init()
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tagui.py", line 584, in init
    tagui_out = _tagui_read()
  File "/usr/local/lib/python3.6/dist-packages/tagui.py", line 130, in _tagui_read
    global _process; return _py23_decode(_process.stdout.readline())
KeyboardInterrupt
>>> datetime.now()
datetime.datetime(2022, 7, 8, 11, 39, 6, 890309)
>>> exit()

2.Run chromium-browser

root@426a5eaa37f2:/# chromium-browser --user-data-dir="~/.tagui/src/chrome/tagui_user_profile" --remote-debugging-port=9222 about:blank --window-size=1366,768 --no-sandbox
[24132:24132:0708/114257.336296:ERROR:ozone_platform_x11.cc(247)] Missing X server or $DISPLAY
[24132:24132:0708/114257.336331:ERROR:env.cc(225)] The platform failed to initialize.  Exiting.
root@426a5eaa37f2:/#
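The "Missing X server or $DISPLAY" message generally means the process has no DISPLAY variable pointing at a running X server. As far as I understand, a virtual display started inside a Python session sets DISPLAY for that Python process and its children, so a separate shell opened afterwards would not see it. That is an assumption about pyvirtualdisplay's behaviour, not something this thread confirms. A small sketch for checking the variable, with `display_name` as a hypothetical helper:

```python
import os


def display_name(env=None):
    """Return the X display the current process would use, or None."""
    env = os.environ if env is None else env
    value = env.get("DISPLAY", "").strip()
    return value or None


if __name__ == "__main__":
    print("DISPLAY =", display_name())
```

Running this in the same Python session right after display.start(), and again in a fresh shell, would show whether the two environments differ.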

3.curl with no result

root@426a5eaa37f2:/# curl -s localhost:9222/json
root@426a5eaa37f2:/#

End

  • I look forward to your reply

vegaviazhang avatar Jul 08 '22 03:07 vegaviazhang

Thanks for the details! Can you try the command below from the terminal to see what happens? It runs the browser in headless mode. See if it loads and whether curl -s localhost:9222/json returns some results.

chromium-browser --user-data-dir="~/.tagui/src/chrome/tagui_user_profile" --remote-debugging-port=9222 about:blank --window-size=1366,768 --no-sandbox --headless --disable-gpu
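The command above can also be assembled in Python, which makes it easy to switch headless mode on and off while testing. One detail worth noting: inside double quotes the shell does not expand `~`, so the command as typed points the profile at a directory literally named `~` under the current directory. The sketch below expands the home directory explicitly. `chromium_command` is a hypothetical helper, and the flags are copied from the commands in this thread.

```python
import os


def chromium_command(headless=True, port=9222, binary="chromium-browser"):
    """Build the argument list for the Chromium test command."""
    profile = os.path.expanduser("~/.tagui/src/chrome/tagui_user_profile")
    args = [
        binary,
        "--user-data-dir=" + profile,
        "--remote-debugging-port=%d" % port,
        "about:blank",
        "--window-size=1366,768",
        "--no-sandbox",
    ]
    if headless:
        args += ["--headless", "--disable-gpu"]
    return args


# Usage sketch: start the browser in the background
#   import subprocess
#   browser = subprocess.Popen(chromium_command(headless=True))
```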

kensoh avatar Jul 08 '22 11:07 kensoh

Yes, I ran the command using the headless mode browser, and it returns some results.

chromium-browser

root@426a5eaa37f2:/# chromium-browser --user-data-dir="~/.tagui/src/chrome/tagui_user_profile" --remote-debugging-port=9222 about:blank --window-size=1366,768 --no-sandbox --headless --disable-gpu
[0711/092410.960971:ERROR:bus.cc(398)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[0711/092410.961075:ERROR:bus.cc(398)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory

DevTools listening on ws://127.0.0.1:9222/devtools/browser/1ff51bff-21b2-457a-8272-c5e7797833a9
[0711/092411.414866:WARNING:bluez_dbus_manager.cc(248)] Floss manager not present, cannot set Floss enable/disable.
[0711/092411.890598:WARNING:sandbox_linux.cc(377)] InitializeSandbox() called with multiple threads in process gpu-process.

curl

root@426a5eaa37f2:/# curl -s localhost:9222/json
[ {
   "description": "",
   "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:9222/devtools/page/9D71AC29DAE17785EFC38416069D480C",
   "id": "9D71AC29DAE17785EFC38416069D480C",
   "title": "about:blank",
   "type": "page",
   "url": "about:blank",
   "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/9D71AC29DAE17785EFC38416069D480C"
} ]
root@426a5eaa37f2:/#

vegaviazhang avatar Jul 11 '22 01:07 vegaviazhang

From above so far, it looks like headless mode can return a websocket connection, however it is not working for virtual display. Maybe there is some unknown reason why the virtual display method is not working. You can try the following -

  1. open the file ~/.tagui/src/tagui, search for --no-sandbox and change it to $headless_switch --no-sandbox. After that, run import rpa as r; r.init(headless_mode = True) to see what happens
  2. download and try the TagUI RPA engine (https://tagui.readthedocs.io/en/latest/setup.html) and run tagui /your_tagui_path/tagui/flows/samples/1_google.tag headless to see what happens
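The edit in step 1 can be scripted so that it is repeatable and keeps a backup. This is a sketch, not the project's own tooling: `add_headless_switch` and `patch_file` are hypothetical helpers, and it assumes the file contains `--no-sandbox` only where the earlier r.dump() one-liner put it.

```python
import os
import shutil


def add_headless_switch(script_text):
    """Put $headless_switch back in front of --no-sandbox, once."""
    if "$headless_switch --no-sandbox" in script_text:
        return script_text  # already patched, leave unchanged
    return script_text.replace("--no-sandbox", "$headless_switch --no-sandbox")


def patch_file(path):
    """Rewrite the launcher in place, keeping a .bak copy next to it."""
    path = os.path.expanduser(path)
    shutil.copy2(path, path + ".bak")
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(add_headless_switch(text))


# Usage sketch:
#   patch_file("~/.tagui/src/tagui")
```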

The steps above try to isolate the issue, to see if headless mode works for you and if the root cause is in the upstream TagUI RPA engine. So far, I haven't come across any user raising this issue, other than the recent similar issue at https://github.com/tebelorg/RPA-Python/issues/404

kensoh avatar Jul 11 '22 10:07 kensoh