write-math icon indicating copy to clipboard operation
write-math copied to clipboard

Eliminate JOINs in gallery for speedup

Open MartinThoma opened this issue 9 years ago • 0 comments

Currently, the gallery page is too slow, because of queries like

SELECT `wm_raw_draw_data`.`id`, `data` as `image`, `creation_date`,
       COUNT(`wm_partial_answer`.`recording_id`) as `answers`,
       COUNT(`wm_worker_answers`.`raw_data_id`) as `answers2`
FROM `wm_raw_draw_data`
LEFT OUTER JOIN `wm_partial_answer`
     ON (`recording_id` = `wm_raw_draw_data`.`id`)
LEFT OUTER JOIN `wm_worker_answers`
     ON (`wm_worker_answers`.`raw_data_id` = `wm_raw_draw_data`.`id`)
WHERE
    `accepted_formula_id` IS NULL AND `is_image`=0 AND
    `classifiable`=1 AND `stroke_segmentable`=1 AND
    `nr_of_symbols` = 1
GROUP BY `wm_raw_draw_data`.`id`
HAVING `answers` > 0 OR `answers2` > 0
ORDER BY `probability` DESC ";

This could probably be much faster if the counters answers (user answers) and answers2 (automated answers) were just simple rows of that table.

  • [x] Update wm_raw_draw_data: Add rows user_answers_count and automated_answers_count
  • [x] gallery: replace JOIN by this count
  • [ ] train: replace JOIN by this count
  • INSERT INTO wm_partial_answer:
  • [x] Update api/index.php
  • [x] Update view
  • wm_worker_answers:
    • [x] Update api/get_unclassified (DELETE)
    • [x] view (DELETE)
  • [ ] How to deal with deletions of users (and cascading deletions of answers)?
  • [ ] How to deal with deletions of workers (and cascading deletions of answers)?
  • [x] Write script to run over the database and calculate / update the counts (in case anything gets out of sync)

MartinThoma avatar Jun 14 '15 14:06 MartinThoma