Configurable DataSource property keys for spring cloud task applications

Open metalpalo opened this issue 3 years ago • 6 comments

Hi, I'm using SCDF version 2.9.2 and have tested it both locally and on Kubernetes (including scheduling). I found that the SCDF server passes the datasource and other properties to the task application via the SPRING_APPLICATION_JSON environment variable when entryPointStyle is boot.

Is it possible to completely disable passing datasource properties to the task application, or at least to control or filter them? We have some legacy task applications that use the primary datasource for business data and a secondary one for task handling. Another case is a task application that uses a different JDBC driver, which is then overridden by the value from the SCDF server. I have read something about prefixes, but I think that was only a suggestion and has not been implemented yet.

thanks

metalpalo avatar Feb 17 '22 12:02 metalpalo

All tasks launched from Spring Cloud Data Flow must implement Spring Cloud Task, i.e. use @EnableTask, and have a data source that connects to the SCDF database, so it can't be disabled.
However, we do have a sample that shows how to support multiple data sources in a single task Boot application, so that you can select which datasource the task should use: https://github.com/spring-cloud/spring-cloud-task/tree/main/spring-cloud-task-samples/multiple-datasources

cppwfs avatar Feb 28 '22 21:02 cppwfs

I understand that I can set up a secondary datasource for task handling. That is not a problem.

But SCDF will always send arguments like spring.datasource.* to my application and override the first datasource, which is used for business logic, right?

If so, I would like to have full control over the datasource used for task handling, meaning that task applications would have a central config with the datasource for the SCDF database.

metalpalo avatar Mar 02 '22 09:03 metalpalo

A user wants the ability to set the keys for the data source properties of their task applications. Currently these keys are fixed: https://github.com/spring-cloud/spring-cloud-dataflow/blob/main/spring-cloud-dataflow-server-core/src/main/java/org/springframework/cloud/dataflow/server/service/impl/TaskServiceUtils.java#L133-L155

cppwfs avatar Mar 02 '22 13:03 cppwfs

As I mentioned, it would be great to be able to disable (or configure) all datasource arguments passed to the task application. I think the task could have full control over how it connects to the SCDF database, for example via config server. Furthermore, passing spring.datasource.driverClassName is very restrictive: when SCDF runs with org.mariadb.jdbc.Driver but the task uses com.mysql.cj.jdbc.Driver, the application has to include the MariaDB dependency.

metalpalo avatar Mar 02 '22 14:03 metalpalo

There have been several requests in this area. We are going to investigate how to better handle multiple 'business database' instances as first-class citizens. Thanks for the feedback.

markpollack avatar May 12 '22 15:05 markpollack

I've re-read the issue and wanted to suggest a workaround that should be easier than revamping SCDF to read from multiple 'business databases' that also contain the Spring Cloud Task/Batch bookkeeping tables. That is a larger design change, but one that we may eventually get to.

The spring cloud task autoconfiguration can be disabled and replaced with configuration that creates a custom implementation of the TaskConfigurer interface, say DataFlowTaskConfigurer. This implementation would read properties such as

spring.scdf.datasource.url
spring.scdf.datasource.driverClassName
spring.scdf.datasource.password
spring.scdf.datasource.username

This would set up the task-related infrastructure. SCDF (as it stands now) would still be sending in the 'bookkeeping' database and not the 'business' database, but there are two ways to override that without a change to SCDF:

  • Use config server: values from config server have a higher priority than those from SPRING_APPLICATION_JSON.
  • Pass in the standard Spring Boot database properties as command line arguments when launching the task, since command line arguments have higher priority than SPRING_APPLICATION_JSON.

One would have to pass in the spring.scdf.datasource.* properties as well, and since credentials are involved, using config server for that would also be recommended.
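To illustrate why the second option works, here is a minimal plain-Java sketch of "first source that defines the key wins" lookup. It is a simplified model for illustration only, not Spring Boot's actual implementation; the class name, the `resolve` helper, the host names, and the sample values are all hypothetical. The ordering used (command line arguments ahead of SPRING_APPLICATION_JSON) is the one described above.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified model of layered property lookup (hypothetical, not Spring code).
class PropertyPrecedenceSketch {

    // Returns the value from the first source that defines the key.
    // Sources are ordered from highest to lowest priority.
    static Optional<String> resolve(String key, List<Map<String, String>> sourcesHighestFirst) {
        for (Map<String, String> source : sourcesHighestFirst) {
            String value = source.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // What SCDF would send: the bookkeeping database (sample values).
        Map<String, String> springApplicationJson = Map.of(
                "spring.datasource.url", "jdbc:mariadb://scdf-db:3306/dataflow",
                "spring.datasource.driverClassName", "org.mariadb.jdbc.Driver",
                "spring.datasource.username", "scdf");

        // What the user passes at launch: the business database (sample values).
        Map<String, String> commandLineArgs = Map.of(
                "spring.datasource.url", "jdbc:mysql://business-db:3306/orders",
                "spring.datasource.driverClassName", "com.mysql.cj.jdbc.Driver");

        // Command line arguments are consulted before SPRING_APPLICATION_JSON.
        List<Map<String, String>> sources = List.of(commandLineArgs, springApplicationJson);

        // Overridden by the command line layer.
        System.out.println(resolve("spring.datasource.url", sources).orElse("<unset>"));
        // Not overridden, so the SCDF-provided value still leaks through.
        System.out.println(resolve("spring.datasource.username", sources).orElse("<unset>"));
    }
}
```

Note the second lookup: any spring.datasource.* key that is not explicitly overridden still resolves to the value SCDF sent, so every key the business datasource relies on would need to be supplied in the higher-priority layer.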

markpollack avatar Aug 09 '22 14:08 markpollack