python-mapswipe-workers

OSMCha server outages break our project creation

Open laurentS opened this issue 3 years ago • 0 comments

Over the past week, osmcha.org has been down at least twice. This causes a crash in our project creation worker:

2022-09-09 07:42:43,433 - INFO - mapswipe_workers.py - cli - Logging enabled
2022-09-09 07:42:43,434 - INFO - mapswipe_workers.py - _run_creation - start mapswipe backend workflow to create projects and tutorials.
2022-09-09 07:42:43,698 - INFO - project.py - handle_input_type - TMId detected
2022-09-09 07:42:43,698 - INFO - api_calls.py - ohsome - Target: https://api.ohsome.org/v1/elements/geometry
2022-09-09 07:42:43,699 - INFO - api_calls.py - ohsome - Filter: building=* and geometry:polygon
2022-09-09 07:42:49,080 - INFO - api_calls.py - ohsome - Query succesfull.
2022-09-09 07:42:49,324 - INFO - api_calls.py - remove_noise_and_add_user_info - starting filtering and adding extra info
2022-09-09 07:42:49,349 - INFO - api_calls.py - remove_noise_and_add_user_info - 1191 changesets will be queried in roughly 12 batches from osmCHA
2022-09-09 07:43:02,124 - INFO - api_calls.py - remove_noise_and_add_user_info - finished query 1/12, 8.3

Traceback (most recent call last):
  File "/usr/local/bin/mapswipe_workers", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/mapswipe_workers.py", line 496, in run
    sched.every(time_interval).minutes.do(_run_creation).run()
  File "/usr/local/lib/python3.8/dist-packages/schedule/__init__.py", line 466, in run
    ret = self.job_func()
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/mapswipe_workers.py", line 478, in _run_creation
    context.invoke(run_create_projects)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/mapswipe_workers.py", line 88, in run_create_projects
    project.geometry = project.validate_geometries()
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/project_types/arbitrary_geometry/project.py", line 79, in validate_geometries
    self.handle_input_type(raw_input_file)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/project_types/arbitrary_geometry/project.py", line 56, in handle_input_type
    result = ohsome(ohsome_request, self.geometry, properties="tags, metadata")
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/utils/api_calls.py", line 202, in ohsome
    response = remove_noise_and_add_user_info(response)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/utils/api_calls.py", line 150, in remove_noise_and_add_user_info
    changeset_results = query_osmcha(subset, changeset_results)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/utils/api_calls.py", line 63, in query_osmcha
    response = retry_get(url, to_osmcha=True)
  File "/usr/local/lib/python3.8/dist-packages/mapswipe_workers/utils/api_calls.py", line 34, in retry_get
    return session.get(url, timeout=timeout, headers=headers)
  File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 542, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='osmcha.org', port=443): Max retries exceeded with url: /api/v1/changesets/?ids=36752937,36752988,36753021,37106107,37180479,37241526,37258472,37267733,37360848,41525463,41526519,41526582,41526669,41526779,41527273,41527309,41527564,45908843,53723849,53725246,53731538,53805068,53741944,53743042,53743769,53745951,53758727,53762001,54254672,53764611,53774064,53774555,53786473,53786625,54889914,53805627,53820619,53820879,54889281,48879172,18809967,18889988,18896991,32664782,18990707,18990799,18991135,18997191,50205186,19020092,50205775,50201334,19113790,19113873,112216386,19113910,38547028,19138075,19138430,19444578,19444710,26606956,50176767,48839244,26607135,26608173,26608520,26608483,26608498,39133422,26608674,26614645,49758755,49758043,26615115,49756399,26615409,49783925,49758258,38863870,26615617,38647411,26618682,50206186,26618919,50205601,50202030,50205996,26630719,26630753,26631096,26631218,26631374,50201193,26631421,26631435,26631498,26631533,37232302,26631540 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='osmcha.org', port=443): Read timed out. (read timeout=4)"))
Sentry is attempting to send 0 pending error messages

This morning, their server is returning HTTP 504 "Gateway Timeout", both on the API and when accessing the site in a browser. I don't understand the project creation workflow well enough to know what the right fix is, but we should prevent the worker from crashing entirely and instead have it post a message to Slack so that project managers know there is an external problem. The job could then be queued for another attempt an hour later, or something of that sort.
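To make the suggestion concrete, here is a minimal sketch of what such a guard could look like. It is not taken from the mapswipe_workers code base: `create_project_safely`, `notify` and `RETRY_DELAY` are hypothetical names, and `notify` stands in for whatever Slack helper the workers use. It relies on the fact that `requests` exceptions derive from `IOError` (an alias of `OSError` in Python 3), so catching `OSError` also covers the `ConnectionError` in the traceback above.

```python
import logging
from datetime import datetime, timedelta
from typing import Callable, Optional, Tuple, Type

logger = logging.getLogger(__name__)

# Assumption: retry one hour later, as suggested in the issue text.
RETRY_DELAY = timedelta(hours=1)


def create_project_safely(
    create_project: Callable[[], None],
    notify: Callable[[str], None],
    external_errors: Tuple[Type[BaseException], ...] = (OSError,),
    now: Optional[datetime] = None,
) -> Optional[datetime]:
    """Run one project creation without letting external outages crash the worker.

    Returns None on success, or the time at which the job should be retried.
    `create_project` and `notify` are hypothetical callables supplied by the caller.
    """
    try:
        create_project()
        return None
    except external_errors as exc:
        # requests.exceptions.RequestException subclasses IOError/OSError,
        # so connection errors and read timeouts from requests land here.
        retry_at = (now or datetime.now()) + RETRY_DELAY
        logger.warning("External service failure during project creation: %s", exc)
        notify(
            "Project creation failed because an external service is unavailable "
            f"({type(exc).__name__}). Next attempt at {retry_at:%Y-%m-%d %H:%M}."
        )
        return retry_at
```

Errors that are not in `external_errors` (a `ValueError` from bad input, for example) still propagate, so genuine bugs are not hidden. The caller would need to keep the project draft around and re-run it once the returned time has passed; how that fits into the existing scheduler is left open here.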

laurentS, Sep 09 '22 08:09