
Admin container can't run training for the Machine Learning models

Open brunoamaral opened this issue 2 years ago • 3 comments

brunoamaral avatar Apr 15 '22 14:04 brunoamaral

1_data_processor.py:

>>> dataset["summary"] = dataset["summary"].apply(html.unescape)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/pandas/core/series.py", line 4433, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1082, in apply
    return self.apply_standard()
  File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1137, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2870, in pandas._libs.lib.map_infer
  File "/usr/local/lib/python3.10/html/__init__.py", line 130, in unescape
    if '&' not in s:
TypeError: argument of type 'NoneType' is not iterable
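The traceback shows `html.unescape` receiving `None`, which suggests at least one row has no summary. A minimal sketch of a guard, assuming empty summaries can be treated as empty strings (the helper name `safe_unescape` is hypothetical and not part of the project):

```python
import html


def safe_unescape(value):
    """Unescape HTML entities in strings; map None/NaN (any non-string) to ''."""
    if isinstance(value, str):
        return html.unescape(value)
    # Assumption: a missing summary is treated as an empty string.
    return ""


# Hypothetical sample values standing in for dataset["summary"].
cleaned = [safe_unescape(v) for v in ["Tom &amp; Jerry", None, "a &lt; b"]]
print(cleaned)

# With the DataFrame from the traceback (assumes `dataset` is already loaded):
# dataset["summary"] = dataset["summary"].apply(safe_unescape)
```

Returning `""` rather than keeping `None` is a design choice here; if later steps need to tell missing summaries apart from empty ones, returning the value unchanged for non-strings may be the better option.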

brunoamaral avatar Jul 24 '22 22:07 brunoamaral

Pushing this up in the roadmap because it would be nice to have the ML model update itself.

brunoamaral avatar Oct 11 '22 18:10 brunoamaral

This issue is over a year old but is still relevant.

I've been looking into it now and then, but never made any progress trying to increase the Docker resources. Maybe it's a host limitation?

Steps to train the ML models:

  1. docker exec -it admin ./manage.py 1_data_processor
  2. docker exec -it admin ./manage.py 2_train_models

After that, the command returns "Killed". For reference, we are running on a DigitalOcean droplet with 2 vCPUs and 4 GB of memory.

Any ideas?
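One thing worth checking: a bare "Killed" with no Python traceback often means the process received SIGKILL, and a common cause is the kernel or the container running out of memory. That is only a guess for this case. The sketch below, meant to be run inside the container, prints the cgroup memory limit and the memory currently available. The file paths are the standard Linux cgroup v2/v1 and `/proc/meminfo` locations; the function names are hypothetical.

```python
from pathlib import Path

# cgroup v2 and cgroup v1 locations of the container memory limit (Linux only).
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)


def parse_cgroup_limit(text):
    """Return the limit in bytes, or None when cgroup v2 reports no limit ('max')."""
    text = text.strip()
    return None if text == "max" else int(text)


def parse_mem_available(meminfo_text):
    """Return MemAvailable from /proc/meminfo contents in bytes, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024  # the value is reported in kB
    return None


def report():
    """Print whichever limit files exist, plus the available memory."""
    for path in CGROUP_LIMIT_FILES:
        if path.exists():
            print(path, "->", parse_cgroup_limit(path.read_text()))
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print("MemAvailable bytes:", parse_mem_available(meminfo.read_text()))


if __name__ == "__main__":
    report()
```

If the reported limit or available memory is well below what the training step needs, that would point to memory rather than CPU as the constraint.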

brunoamaral avatar May 14 '23 10:05 brunoamaral