Docker containers keep restarting

Open radtrentasei opened this issue 2 years ago • 8 comments

I have done a fresh install of netpalm but the containers do not stay UP.

(py3venv) [developer@devbox ~]$ docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED             STATUS                          PORTS                                       NAMES
802b12dd5126   netpalm_netpalm-worker-fifo     "python3 worker.py f…"   About an hour ago   Restarting (1) 26 seconds ago                                               netpalm_netpalm-worker-fifo_1
ce39b3e672e8   netpalm_netpalm-worker-pinned   "python3 worker.py p…"   About an hour ago   Restarting (1) 6 seconds ago                                                netpalm_netpalm-worker-pinned_1
706c38ed13ec   netpalm_netpalm-controller      "/bin/sh -c 'gunicor…"   About an hour ago   Up 7 seconds                    0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   netpalm_netpalm-controller_1
300a7e0bfa37   netpalm_redis                   "docker-entrypoint.s…"   About an hour ago   Up About an hour                6379/tcp                                    netpalm_redis_1

This is the error I get when running docker-compose in the foreground:

netpalm-worker-fifo_1    | TypeError: To define root models, use `pydantic.RootModel` rather than a field called '__root__'

radtrentasei avatar Oct 06 '23 18:10 radtrentasei

The problem is the pydantic version.

In requirements.txt, most Python modules either have no version at all or are pinned to one specific version:

requirements.txt
fastapi
ttp
netmiko==3.3.2
napalm
ncclient==0.6.9
requests
redis==4.5.1
rq
xmltodict
jinja2
jinja2schema
jsonschema
genie
pyyaml
cachelib==0.3.0
python-redis-lock
filelock
jsonpath_ng
apscheduler==3.6.3
puresnmp==1.9.1
pydantic
names_generator==0.1.0

That is bad practice and is exactly what causes issues like this. The proper way to define a requirement is with a version range. Here, pydantic released a new major version, 2.0, on 2023-06-30; the latest version 1 release, v1.10.13 to be precise, came out on 2023-09-27. The netpalm code was not adjusted to work with version 2, and because requirements.txt gives no version for pydantic, the latest version gets installed and causes this issue.

If you change the pydantic line in requirements.txt to the following and rebuild the containers, it works again: pydantic>=1.10.13,<2.0
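To make the range concrete, here is a minimal Python sketch of what `>=1.10.13,<2.0` accepts and rejects. The helper names `parse_version` and `in_pydantic_v1_range` are made up for illustration, and the parsing assumes plain dotted releases such as `1.10.13` (no pre-release tags). pip's real specifier matching is more thorough.

```python
def parse_version(text):
    """Turn a plain dotted release such as '1.10.13' into a tuple of ints.

    Assumption: no pre-release or local suffixes (e.g. '2.0b1' is not handled).
    """
    return tuple(int(part) for part in text.split("."))


def in_pydantic_v1_range(version):
    """Return True if `version` satisfies the range >=1.10.13,<2.0."""
    lower = parse_version("1.10.13")
    upper = parse_version("2.0")
    return lower <= parse_version(version) < upper


# Inside a rebuilt worker container, the installed version could be read with
# importlib.metadata.version("pydantic") and passed to in_pydantic_v1_range().
```

Any 1.10.x release from 1.10.13 onward passes, while every 2.x release is rejected, which is what keeps the `__root__` error from coming back.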

Usually, every dependency in requirements.txt should have a range from the current version up to the next major/minor release, so that you get all security updates but avoid breaking changes between versions.

Pinning modules to a specific version is also a bad idea, as you miss out on all security updates.
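A rough way to see which lines of a requirements.txt fall into each category is sketched below in Python. The functions `classify_requirement` and `audit` are hypothetical helpers, not part of netpalm or pip, and the classification is deliberately simplified: for example, a wildcard such as `==1.*` would be reported as an exact pin.

```python
def classify_requirement(line):
    """Classify one requirements.txt line as 'unpinned', 'exact', or 'range'.

    Returns None for blank lines and comment-only lines.
    """
    spec = line.split("#", 1)[0].strip()
    if not spec:
        return None
    # The version constraint starts at the first comparison character.
    start = next((i for i, ch in enumerate(spec) if ch in "<>=!~"), None)
    if start is None:
        return "unpinned"
    constraint = spec[start:]
    if constraint.startswith("==") and "," not in constraint:
        return "exact"
    return "range"


def audit(lines):
    """Group requirement lines by how their version is constrained."""
    report = {"unpinned": [], "exact": [], "range": []}
    for line in lines:
        kind = classify_requirement(line)
        if kind is not None:
            report[kind].append(line.strip())
    return report
```

Run against the list above, `fastapi` and `pydantic` would land under `unpinned`, `netmiko==3.3.2` under `exact`, and only a line like `pydantic>=1.10.13,<2.0` under `range`.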

empusas avatar Oct 09 '23 10:10 empusas

Thanks, any chance we can integrate this into the main repo? I am happy to open the PR if needed.

radtrentasei avatar Oct 10 '23 06:10 radtrentasei

Feel free to open a PR and I will look to merge it.

tbotnz avatar Oct 10 '23 08:10 tbotnz

I've paused the PR because I've noticed that the containers are still restarting. This time I think the redis certificate has expired:

netpalm-controller_1     | redis.exceptions.ConnectionError: Error 1 connecting to redis:6379. [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1131).

@tbotnz would you be able to help?

radtrentasei avatar Oct 10 '23 13:10 radtrentasei

There is a script in the main directory of the repo called "redis_gen_new_certs.sh". Run the script, then rebuild the containers. It will work after that. You can then include the new certificates in your PR.
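If you want to confirm the expiry before regenerating, the following Python sketch compares a certificate's notAfter date with the current time. It assumes you have already read the date string yourself, for example from the `notAfter=` line printed by `openssl x509 -enddate -noout -in <certificate file>`; the certificate path inside the repo is not given in this thread, so it is left as a placeholder. `cert_has_expired` is a made-up helper name.

```python
import ssl
import time


def cert_has_expired(not_after, now=None):
    """Return True if the certificate's notAfter time has passed.

    `not_after` uses the OpenSSL text format, e.g. 'Jan  5 09:34:43 2018 GMT'.
    `now` is seconds since the epoch; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return current >= expires_at
```

If this returns True for the redis certificate, that matches the "certificate has expired" error above, and regenerating the certificates is the fix.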

empusas avatar Oct 10 '23 16:10 empusas

I've been able to make them work and I've opened the PR. Please review.

radtrentasei avatar Dec 06 '23 17:12 radtrentasei

@tbotnz any chance you can review please?

radtrentasei avatar Jan 09 '24 19:01 radtrentasei