tasking-manager icon indicating copy to clipboard operation
tasking-manager copied to clipboard

Review Task Submission and Validation api query performance

Open dakotabenjamin opened this issue 3 years ago • 4 comments

Users on slack have noticed that there can be some latency when trying to submit a task as complete or validated. Last week we reduced the instance tier of the database service- it is still performing well from a CPU/Mem standpoint but these issues were not noticable before. From an infrastructure side we can increase some postgresql server parameters (on RDS some parameters are dependent on current instance memory), but we should also examine the queries behind these api calls. In particular, DB Insights noticed these long-running queries:

UPDATE users SET tasks_mapped=? WHERE users.id = ?;
UPDATE users SET last_validation_date=?::timestamp WHERE users.id = ?;
SELECT count(*) AS count_1 FROM tasks WHERE tasks.task_status IN (?, ?);

From New Relic, about 60% of the slow database transactions are coming from backend.api.tasks.actions:tasksactionsmappingunlockapi

dakotabenjamin avatar Feb 18 '22 21:02 dakotabenjamin

I was curious so I took a little dive into this. I do see some optimization opportunities in the codes that update the stats (which seems to be correlated to the long running queries and originate from tasksactionsmappingunlockapi).

In unlock_task_after_mapping, we call StatsService.update_stats_after_task_state_change() . This function takes in a project_id so that it can fetch an entire Project object; this includes fetching the geometry of the project when all the function ends up doing is updating a few stats for the project. I think this also aligns with a comment you made in slack @dakotabenjamin about the project in question having a complex geometry.

The Project object has always been a speed bump (literally) due to the sheer size of the object. Perhaps it's time to take look at Project again and see where we can clean some things up. For this issue, it may be best to break the stats columns out of the projects table and into its own table (project_stats maybe). Task unlocking is arguably (but I'd put money on it) one of if not the most common action in the TM. There's no real reason to be fetching an entire Project object just to update stats. Overall, I think the projects table can be simplified and we can just handle everything with existing or new DTO's that may just need to query an extra table. But for most cases (especially task actions) it should improve performance as well as simplify the codebase. Like I said, the Project object has always been a bit bloated IMO.

I hope this helps and isn't just noisy ramblings 😃 . Also, this was the most glaring thing I noticed when I followed the call stack. I'm not claiming this is the sole reason for potential latency issues and there are likely other optimizations elsewhere in the code.

zlavergne avatar Feb 23 '22 16:02 zlavergne

I am also getting messages about commenting and invalidating tasks:

Loading back to normal now, not by uploading after validation with or without a comment but this is happening in 1 on 3 cases for weeks now. Waiting doesn't help. Reload TM helps sometimes but of course than you have to rewrite the comment.

dakotabenjamin avatar Mar 01 '22 15:03 dakotabenjamin

few days back (10th of May 2022 around 11PM CEST) I wasn't able to upload task as validated. When I clicked "Submit task" it was loading for half a second and then nothing happened. I tried few times with the same result. Then I tooked screenshot of my console and refreshed the page. It logged me off so I logged back in, get back to validating page, pressed "Submit task" and all worked.

Not a big issue, not sure if it is related to this. My shot would be that it randomly logged me off while I was working on the task. TM error .

Patrik-Br avatar May 15 '22 10:05 Patrik-Br

Also related to #2003

Aadesh-Baral avatar Jun 21 '22 08:06 Aadesh-Baral