Recreating network should not recreate running node containers

Open xichen1 opened this issue 1 year ago • 1 comment

When a user deletes a network and creates a new one, errors are reported by both the api-engine and docker-agent. This issue arises because whenever a network is created, containers are also created for all nodes belonging to the current organization. You can refer to the relevant code here.

Docker rejects a "create" request for a container whose name is already in use, which results in the errors below. One possible solution is to filter for nodes that are not running on this line.

api-engine error:

Exception in thread Thread-79:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/var/www/server/api/routes/network/views.py", line 160, in _start_node
    raise e
  File "/var/www/server/api/routes/network/views.py", line 158, in _start_node
    raise ResourceNotFound(detail="Container Not Built")
api.exceptions.ResourceNotFound: Container Not Built

docker-agent error:

ERROR:server:Exception on /api/v1/nodes [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 268, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http+docker://localhost/v1.41/containers/create?name=real-peer-20.node1.cello.com

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2190, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1486, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/var/www/server/server.py", line 89, in create_node
    container = client.containers.run(
  File "/usr/local/lib/python3.8/site-packages/docker/models/containers.py", line 858, in run
    container = self.create(image=image, command=command,
  File "/usr/local/lib/python3.8/site-packages/docker/models/containers.py", line 917, in create
    resp = self.client.api.create_container(**create_kwargs)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 431, in create_container
    return self.create_container_from_config(config, name, platform)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 448, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 274, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/usr/local/lib/python3.8/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e

docker.errors.APIError: 409 Client Error for http+docker://localhost/v1.41/containers/create?name=real-peer-20.node1.cello.com: Conflict ("Conflict. The container name "/real-peer-20.node1.cello.com" is already in use by container "4c97783b97b4ad987ba62961fe642c53efbb3cc5bd96d2cbb3724796321b5a03". You have to remove (or rename) that container to be able to reuse that name.")
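
The filtering idea mentioned above could look roughly like the following. This is only a minimal sketch: the `Node` class, its `status` field, and the `"running"` value are hypothetical stand-ins for illustration, not the actual api-engine model.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Node:
    """Hypothetical stand-in for a node record; not the real api-engine model."""
    name: str
    status: str  # assumed values such as "running" or "stopped"


def nodes_needing_containers(nodes: Iterable[Node]) -> List[Node]:
    """Return only the nodes that are not already running.

    The network-creation code would then send "create" requests
    for these nodes only and leave running nodes untouched.
    """
    return [node for node in nodes if node.status != "running"]


if __name__ == "__main__":
    org_nodes = [
        Node(name="peer0", status="running"),
        Node(name="peer1", status="stopped"),
    ]
    for node in nodes_needing_containers(org_nodes):
        print(node.name)
```

Note that a stopped container still holds its name in Docker, so a status filter alone may not prevent the 409 for nodes whose containers exist but are not running.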

xichen1, Jun 30 '23 08:06

When a "create" request targets a container that already exists, is it possible to identify that case from the Docker engine's response message? If so, we could just ignore the error and return OK.
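
One way this could work is to key off the HTTP status rather than the message text, since the agent's traceback shows a 409 Conflict. The sketch below is an assumption-laden illustration, not the docker-agent's code: `create_or_reuse`, `create`, and `lookup` are hypothetical names, and the error is inspected by duck typing on a `status_code` attribute (which `docker.errors.APIError` in the Docker SDK for Python exposes).

```python
from typing import Any, Callable, Tuple

HTTP_CONFLICT = 409  # status Docker returns when the container name is taken


def is_name_conflict(exc: BaseException) -> bool:
    """True if the exception carries a 409 status code.

    Duck-typed so it works with any error object that has `status_code`.
    """
    return getattr(exc, "status_code", None) == HTTP_CONFLICT


def create_or_reuse(
    create: Callable[[str], Any],
    lookup: Callable[[str], Any],
    name: str,
) -> Tuple[Any, bool]:
    """Try to create a container; on a name conflict, return the existing one.

    `create` and `lookup` are hypothetical callables supplied by the caller,
    e.g. thin wrappers around `client.containers.run` and
    `client.containers.get`. Returns (container, created_flag).
    """
    try:
        return create(name), True
    except Exception as exc:
        if not is_name_conflict(exc):
            raise  # unrelated failures should still surface
        return lookup(name), False
```

With this shape, the POST handler could return OK in both cases and use the boolean to decide whether any follow-up (such as starting a stopped container) is needed.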

yeasy, Jul 01 '23 02:07