
Split bulkdata webapp into a separate deployable

Open lmsurpre opened this issue 4 years ago • 5 comments

Is your feature request related to a problem? Please describe.

The fhir-server image is quite large and some dependencies are duplicated between the two webapps (fhir-server and fhir-bulkdata).

Describe the solution you'd like

Our thinking was to eventually split fhir-bulkimportexport-webapp into a separate project with its own docker image. This way, the ibm-fhir-server can stay its current size (or get a bit smaller), and we can put all the new heft into the new image and call that ibmcom/ibm-fhir-job-server or some such.

There is already a loose coupling between the main FHIR Server and the bulk data webapp in that the latter is invoked via the Liberty Batch feature (after fhir-operation-bulkdata creates the job).

Describe alternatives you've considered

An alternative we considered is to load the spark and stocator dependencies into the server's shared lib directory (wlp/usr/shared/resources/lib). This would allow us to continue posting the fhir-bulkimportexport-webapp to bintray, but we'd still pay the cost in terms of the size (and complexity) of our installer zip and docker image.

Additional context

Discussion at https://chat.fhir.org/#narrow/stream/212434-ibm/topic/Export.20to.20Parquet

lmsurpre, Aug 07 '20 18:08

and enable "export to parquet" by default

No longer part of the deliverable.

prb112, Feb 15 '22 20:02

Create a new docker image for fhir-bulkdata-webapp (named linuxforhealth/fhir-bulkdata) which is separate from the fhir-server image.

Under the demo directory, add the new container to the docker-compose environments and ensure the bulk data operations are still functional. Then do the same for the CI pipeline.

lmsurpre, Nov 08 '22 14:11

notes:

  • currently these two webapps share a single fhir-server-config (and extension-search-parameters). how best to tease out the bulkdata configuration?
    • bulk import performs validation and also search parameter extraction
    • suggestion: keep a single fhir-server-config for now. in docker-compose, define a volume for this config and have both webapp containers use that for their config.
  • implementation guides and related info are in userlib and that is used by both webapps
    • suggestion: in docker-compose, define a volume for userlib and mount it in both containers
  • same for configDropins?
    • bulkdata.xml (and the db-specific flavors under configDropins/disabled) should be moved to the fhir-bulkdata-webapp
    • but datasources.xml (which defines the fhir server db connection info) will be shared (or duplicated)

lmsurpre, Nov 08 '22 14:11

By default, the FHIR Bulkdata server will be installed with the JDBC persistence layer configured to use a single-tenant Embedded Derby database. The context lookup for jdbc/fhirbatchDB in fhir-operation-bulkdata (org.linuxforhealth.fhir.operation.bulkdata.client.action.batch.BatchCancelRequestAction.supportsDeleteJob()) will be removed, since jdbc/fhirbatchDB will be configured only on the FHIR Bulkdata server.
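
For illustration only, a Liberty configDropins fragment along these lines could define jdbc/fhirbatchDB on the bulkdata server with embedded Derby. This is a rough sketch, not the project's actual bulkdata.xml: the element ids, the Derby library directory, the database name, and the schema name are all assumptions.

```xml
<!-- Hypothetical configDropins fragment for the bulkdata server.
     Ids, paths, database name, and schema name are illustrative assumptions. -->
<server>
    <!-- Derby driver jars; the directory is an assumed location -->
    <library id="derbyLib">
        <fileset dir="${shared.resource.dir}/lib/derby" includes="*.jar"/>
    </library>

    <!-- The JNDI name that fhir-operation-bulkdata would no longer need to look up -->
    <dataSource id="fhirbatchDS" jndiName="jdbc/fhirbatchDB">
        <jdbcDriver libraryRef="derbyLib"/>
        <properties.derby.embedded databaseName="derby/fhirDB" createDatabase="create"/>
    </dataSource>

    <!-- Point the Liberty Batch job repository at that datasource -->
    <batchPersistence jobStoreRef="BatchDatabaseStore"/>
    <databaseStore id="BatchDatabaseStore" dataSourceRef="fhirbatchDS" schema="FHIR_JBATCH" tablePrefix=""/>
</server>
```

With something like this living only in the bulkdata image, the fhir-server side would have no need for a jdbc/fhirbatchDB binding of its own.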

PrasannaHegde1, Dec 08 '22 14:12

Can we remove the dependency from fhir-server to the IBM COS sdk as part of this split? Currently we have this in fhir-server/pom.xml:

        <dependency>
            <groupId>com.ibm.cos</groupId>
            <artifactId>ibm-cos-java-sdk</artifactId>
        </dependency>

Might we also be able to switch back to the "default" http lib from the azure blob sdk if we remove this IBM COS one?

lmsurpre, Dec 15 '22 14:12