
Docker environment

Open grahame opened this issue 4 years ago • 5 comments

Hello

I've been trying out Phoenix CTSMS and, as part of that, I have converted your install.sh VM-based process into a Docker environment. Everything seems to work, with the exception of the Perl-based bulk data loading system.

If you would like to have a look: https://github.com/adaptivehealthintelligence/docker-ctsms

If this is something you'd like to keep around, it could perhaps be turned into a PR?

grahame avatar Jul 27 '20 06:07 grahame

> If this is something you'd like to keep around, it could perhaps be turned into a PR?

definitely, thanks a lot for this amazing contribution! i'll take a look at the bulk-processor part.

rkrenn avatar Jul 27 '20 09:07 rkrenn

while still learning docker, it looks to me like the only missing parts for the bulk-processor would be:

  • tune the job command lines, which by default look like this:

```
db_tool=/ctsms/dbtool.sh
#db_tool=C\:\\ctsms\\dbtool.bat
ecrf_process_pl=perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/EcrfExporter/process.pl --config=config.job.cfg
#ecrf_process_pl=perl "D\:\\configuration\\config-default\\bulk_processor\\CTSMS\\BulkProcessor\\Projects\\ETL\\EcrfExporter\\process.pl"
inquiry_process_pl=perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/InquiryExporter/process.pl --config=config.job.cfg
#inquiry_process_pl=perl "D\:\\configuration\\config-default\\bulk_processor\\CTSMS\\BulkProcessor\\Projects\\ETL\\InquiryExporter\\process.pl"
```

it requires being able to launch a command provided by e.g. the "bulk-processor" container from the "tomcat runtime" container. these command lines can be overridden just by redefining them in a /ctsms/properties/ctsms-settings.properties file inside the "tomcat runtime" container. Once this works, all the features from a trial's "Job" tab become available. I think this is the most important part, as it includes functionality such as eCRF data export.

  • any IP addresses defined in the .cfg and .yml files in the /ctsms/bulk-processor folder of the "bulk-processor" container should point to the URL/IP of the restapi exposed by the "tomcat runtime" container. For this it should be enough to clone the https://github.com/phoenixctms/config-default repo (i.e. https://github.com/phoenixctms/config-docker). That said, it might also make more sense to put the docker stuff into its own repo.

  • optional: launch the fastcgi "signup" webapp at the end of the "bulk-processor" container, just like catalina.sh is launched as the "entrypoint" at the end of the tomcat:8 runtime. memcached is needed for that "signup" webapp only, not for the main JEE app.
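As a rough illustration, the container layout implied by the points above could be sketched in a docker-compose file. All service names, image names, ports and environment variables here are illustrative assumptions, not taken from the actual repos:

```yaml
# hypothetical sketch of the container layout discussed above;
# names, ports and env vars are assumptions, not from the actual repos
version: "3"
services:
  runtime:              # "tomcat runtime" container: JEE app + restapi
    image: ctsms-runtime
    ports:
      - "8080:8080"
  bulk-processor:       # Perl bulk-processor container
    image: ctsms-bulk-processor
    environment:
      # the .cfg/.yml files inside should point at the runtime's restapi,
      # not at localhost
      - CTSMS_RESTAPI_URL=http://runtime:8080/ctsms/rest
    depends_on:
      - runtime
  memcached:            # needed only by the optional fastcgi "signup" webapp
    image: memcached
```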

rkrenn avatar Jul 29 '20 15:07 rkrenn

One challenge is that launchJob(...) in CoreUtil.java directly executes the perl application. We could potentially install the perl environment into the tomcat container, so that this is possible, but it is a little messy - in containerised applications it's nice to have some separation.

I suppose to add that indirection you'd want to use some sort of task queue. I don't know what this is like in the Java world – in Python I've used Celery for this.

If it's too big of a change, we could put the bulk-processor into the tomcat container for now?
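That interim approach might look roughly like the Dockerfile sketch below: extend the tomcat image with a Perl environment so that launchJob(...) can exec the bulk-processor scripts locally. The base image, paths and package names are assumptions for illustration, not the actual build:

```dockerfile
# hypothetical sketch, not the actual build:
# bundle the Perl bulk-processor into the tomcat runtime image
FROM tomcat:8

# install perl plus a CPAN client for the bulk-processor's dependencies
RUN apt-get update \
 && apt-get install -y --no-install-recommends perl cpanminus \
 && rm -rf /var/lib/apt/lists/*

# copy the bulk-processor tree into the runtime container, so the default
# ecrf_process_pl/inquiry_process_pl command lines in
# ctsms-settings.properties can point at these local paths
COPY bulk_processor/ /ctsms/bulk_processor/
```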

grahame avatar Jul 30 '20 03:07 grahame

i presume it should be as simple as configuring

```
ecrf_process_pl=perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/EcrfExporter/process.pl --config=config.job.cfg
inquiry_process_pl=perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/InquiryExporter/process.pl --config=config.job.cfg
```

to

```
ecrf_process_pl=docker run bulk-processor perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/EcrfExporter/process.pl --config=config.job.cfg
inquiry_process_pl=docker run bulk-processor perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/InquiryExporter/process.pl --config=config.job.cfg
```

but let me create a separate "docker" repo here first.

rkrenn avatar Jul 30 '20 07:07 rkrenn

ok, launching a container from within a container is possible in theory, but discouraged.
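For completeness, the workaround usually reached for here (not the approach chosen in this thread, and one with real security implications, since mounting the docker socket effectively hands the runtime container control of the docker daemon) would be `docker exec` against the already-running bulk-processor container. A hedged sketch of what that override might look like:

```
# hypothetical alternative, NOT what was adopted here:
# mount /var/run/docker.sock into the "tomcat runtime" container,
# install the docker CLI there, then override e.g.
ecrf_process_pl=docker exec bulk-processor perl /ctsms/bulk_processor/CTSMS/BulkProcessor/Projects/ETL/EcrfExporter/process.pl --config=config.job.cfg
```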

there already is a gearman implementation as part of bulk-processor for job distribution/queueing, which originated from another project. but i was trying to avoid that complexity here and just launch the job processes directly. each job reports its state updates via the rest-api.

If it's too big of a change, we could put the bulk-processor into the tomcat container for now?

yes, i think that's the most feasible approach for now. the bulk-processor container could be kept (renamed to "dancer runtime", as opposed to the (tomcat) "runtime" one) for running the optional and separate signup webapp.

rkrenn avatar Jul 30 '20 09:07 rkrenn