
CircleCI performance improvement

Open dippindots opened this issue 3 years ago • 4 comments

This improves performance in three ways:

  • Use the large resource class
  • Enable parallelism (3 parallel instances)
  • Separate the all-in-one validation command into individual test commands (one for each study)
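As a rough illustration of the last two points, here is a minimal sketch of how each parallel instance might pick its own subset of studies. It is not necessarily how this change does it. `CIRCLE_NODE_INDEX` and `CIRCLE_NODE_TOTAL` are the environment variables CircleCI sets when parallelism is enabled; the `select_studies` helper and the study names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: choose the studies that this parallel instance should validate.
# CIRCLE_NODE_INDEX / CIRCLE_NODE_TOTAL come from CircleCI when parallelism
# is enabled; the defaults below only exist to allow a local dry run.

# Hypothetical helper: print every study whose position (0-based) modulo
# the number of instances equals this instance's index.
select_studies() {
  local node_index="$1" node_total="$2"
  shift 2
  local i=0 study
  for study in "$@"; do
    if [ $(( i % node_total )) -eq "$node_index" ]; then
      echo "$study"
    fi
    i=$(( i + 1 ))
  done
}

# Hypothetical study names, for illustration only.
studies="study_a study_b study_c study_d study_e"

# $studies is intentionally unquoted so it splits into separate arguments.
for study in $(select_studies "${CIRCLE_NODE_INDEX:-0}" "${CIRCLE_NODE_TOTAL:-3}" $studies); do
  echo "would validate: $study"   # placeholder for the real per-study test command
done
```

With 3 instances, each one ends up with roughly a third of the studies, so the individual test commands can run side by side instead of in one long sequence.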

dippindots avatar Oct 13 '22 06:10 dippindots

@inodb Thanks for the comment, just updated

dippindots avatar Oct 27 '22 16:10 dippindots

@dippindots thanks so much for fixing this!

Small thing, the log shows multiple warning messages and exit status messages for the same study. Is that expected when running processes in parallel?

[Screenshot: Screen Shot 2022-10-27 at 2 36 12 PM]

rmadupuri avatar Oct 27 '22 18:10 rmadupuri

@rmadupuri This is a little confusing. We run multiple validation tasks in the background at the same time, and each task produces informational messages like these. Messages from different studies get mixed together here, which is why several validation messages appear together. (I can't do anything about that because the messages come from the study validation script.) After we merge this PR, we should only look at the summary messages at the bottom.

dippindots avatar Oct 27 '22 19:10 dippindots

@rmadupuri good point! @dippindots it should be possible to redirect the output by adding something like > myjob_output.txt to each command. Then at the end you could either show the output all together, e.g. with cat, or upload it as an artifact of the job so people can inspect it manually.
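A minimal sketch of that idea, assuming the validations are started as background jobs from a shell step. The `validate_study` function is a hypothetical stand-in for the real study validation command, and the log directory and study names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run validations in the background with one log file per study,
# then print the logs one after another so the messages do not interleave.

log_dir="$(mktemp -d)"

# Hypothetical stand-in for the real study validation command.
validate_study() {
  echo "validating $1"
  echo "warning: example message for $1" >&2
}

run_all() {
  local study
  for study in "$@"; do
    # Send stdout and stderr of each background job to its own file.
    validate_study "$study" > "$log_dir/$study.log" 2>&1 &
  done
  wait  # block until every background validation has finished
  for study in "$@"; do
    echo "===== $study ====="
    cat "$log_dir/$study.log"
  done
}

run_all study_a study_b
```

Note that a bare `wait` does not report the exit status of the individual jobs, so a real step would still need to collect those separately for the summary. The log directory could also be stored as a job artifact instead of (or in addition to) being printed.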

inodb avatar Oct 27 '22 20:10 inodb