
unchunked archive/observations communication

Open sjanssen2 opened this issue 1 year ago • 1 comment

Hi Qiita team, we just ran into an issue while our fresh Qiita instance was processing a real-world-sized (~700 samples) 16S study. Since the DB is fresh, no SEPP placements were stored yet. Unfortunately, the API call at https://github.com/qiita-spots/qp-deblur/blob/efd59e3cd6ea176557633bbfd86eafd28072597a/qp_deblur/deblur.py#L506-L507 failed with the error message:

Error executing Deblur 2021.09: ['Traceback (most recent call last):\n', ' File "/homes/sjanssen/bcf_qiita/envs/deblur/lib/python3.5/site-packages/qiita_client/plugin.py", line 266, in __call__\n qclient, job_id, job_info[\'parameters\'], output_dir)\n', ' File "/homes/sjanssen/bcf_qiita/envs/deblur/lib/python3.5/site-packages/qiita_client/plugin.py", line 105, in __call__\n return self.function(qclient, server_url, job_id, output_dir)\n', ' File "/homes/sjanssen/bcf_qiita/envs/deblur/lib/python3.5/site-packages/qp_deblur/deblur.py", line 507, in deblur\n path=job_id, value=json.dumps(new_placements))\n', ' File "/homes/sjanssen/bcf_qiita/envs/deblur/lib/python3.5/site-packages/qiita_client/qiita_client.py", line 470, in patch\n return self._request_retry(self._session.patch, url, **kwargs)\n', ' File "/homes/sjanssen/bcf_qiita/envs/deblur/lib/python3.5/site-packages/qiita_client/qiita_client.py", line 375, in _request_retry\n % (req.__name__, url, r.status_code, r.text))\n', "RuntimeError: Request 'patch https://qiita.jlab.bio/qiita_db/archive/observations/' did not succeed. Status code: 413. Message: <html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.25.3</center>\r\n</body>\r\n</html>\r\n\n"]

This is due to a relatively small client_max_body_size=7M configuration for nginx. After increasing it to 100M, the call worked; the payload size was ~22M. I wonder whether it would be worth implementing chunking, similar to the file upload, to prevent issues with really large data transfers?
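To illustrate what I mean, here is a rough, hypothetical sketch of client-side chunking (this is not part of qp-deblur today; the function name and chunk size are made up, and it assumes the archive/observations endpoint would accept and merge repeated 'add' patches for the same path, which would need to be verified or implemented server-side):

```python
# Hypothetical sketch only: split new_placements into several smaller PATCH
# requests so each request body stays below nginx's client_max_body_size.
import json


def patch_placements_in_chunks(qclient, job_id, new_placements, chunk_size=1000):
    """Send new_placements to the archive endpoint in chunks of `chunk_size` keys."""
    keys = list(new_placements)
    for start in range(0, len(keys), chunk_size):
        chunk = {k: new_placements[k] for k in keys[start:start + chunk_size]}
        # mirrors the call in qp_deblur/deblur.py, but with a smaller payload
        qclient.patch('/qiita_db/archive/observations/', 'add',
                      path=job_id, value=json.dumps(chunk))
```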

sjanssen2 avatar May 02 '24 11:05 sjanssen2

We haven't had the need to implement any chunking mechanism for those pages, but since we control our configuration (nginx) we can raise those values "easily" and limit entry to those endpoints. For example, we set client_max_body_size based on the request page and the specific request to nginx.

However, I think this comes down to a combination of what you can do in your installation and personal preference.

Anyway, FWIW the client_max_body_size values in the main Qiita site (depending on the entry point) are: 300M, 600M, 1500M.
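For illustration, a per-location override along these lines achieves that kind of per-endpoint limit (paths, sizes, port and upstream name below are examples, not our actual deployment configuration):

```nginx
# Illustrative snippet only -- values are examples, not Qiita's real config.
http {
    upstream qiita_backend {
        server 127.0.0.1:8383;   # placeholder backend address
    }

    server {
        listen 80;
        client_max_body_size 7m;            # conservative default for most pages

        location /qiita_db/archive/observations/ {
            client_max_body_size 100m;      # allow large JSON PATCH bodies here
            proxy_pass http://qiita_backend;
        }

        location / {
            proxy_pass http://qiita_backend;
        }
    }
}
```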

antgonza avatar May 02 '24 13:05 antgonza

Closing for now, please reopen if you have further questions.

antgonza avatar May 15 '24 12:05 antgonza