Sources synchronization improvements
Problem
At present, Jitsu writes data from sources into destinations (data warehouses) without any SQL typecasts, which sometimes leads to SQL type errors. Adding typecasts to every insert should resolve some of these type incompatibility issues. Jitsu should also provide more information about the data schema in task logs.
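As a rough illustration of the idea, the sketch below builds a Postgres-style INSERT in which every placeholder is cast to the column type taken from the table schema, and formats a task log line that lists the columns with their SQL types. The function names (`sortedColumns`, `buildInsertWithCasts`, `batchLogMessage`) and the `map[string]string` schema shape are assumptions made for this example, not Jitsu's actual adapter API. Identifier quoting is simplified, and MySQL would need `CAST(? AS type)` instead of the `::type` syntax.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortedColumns returns the schema's column names in alphabetical order so
// that generated statements and log lines are deterministic.
func sortedColumns(schema map[string]string) []string {
	columns := make([]string, 0, len(schema))
	for column := range schema {
		columns = append(columns, column)
	}
	sort.Strings(columns)
	return columns
}

// buildInsertWithCasts builds a parameterized Postgres INSERT statement where
// each placeholder is cast to the SQL type of its column.
// schema maps column name -> SQL type (hypothetical shape for this sketch).
func buildInsertWithCasts(table string, schema map[string]string) string {
	columns := sortedColumns(schema)
	names := make([]string, len(columns))
	placeholders := make([]string, len(columns))
	for i, column := range columns {
		names[i] = `"` + column + `"`
		placeholders[i] = fmt.Sprintf("$%d::%s", i+1, schema[column])
	}
	return fmt.Sprintf(`INSERT INTO "%s" (%s) VALUES (%s)`,
		table, strings.Join(names, ", "), strings.Join(placeholders, ", "))
}

// batchLogMessage formats a task log line describing a batch insert, listing
// the destination columns together with their SQL types.
func batchLogMessage(table string, rows int, schema map[string]string) string {
	columns := sortedColumns(schema)
	described := make([]string, len(columns))
	for i, column := range columns {
		described[i] = column + " " + schema[column]
	}
	return fmt.Sprintf("Inserting %d records into the %s table with columns: %s",
		rows, table, strings.Join(described, ", "))
}
```

For a schema of `{"id": "text", "created_at": "timestamp", "amount": "numeric"}` and the table `events`, `buildInsertWithCasts` returns `INSERT INTO "events" ("amount", "created_at", "id") VALUES ($1::numeric, $2::timestamp, $3::text)`.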
Separately, the taskController closes stalled tasks after a server restart and sets their status to FAILED. Such tasks should be marked CANCELED instead, since that status is clearer.
Solution
[ ] - Add SQL typecasts to all inserts, based on the existing table schema, in all adapters (Postgres, MySQL). Typecasts from mappings take priority over schema-based ones. (Check whether mapping typecasts are still used.)
[ ] - Add a task log entry before every batch insert, like: Inserting XX records into the YY table with columns: <sorted list of destination table columns with SQL types>
[ ] - Change the SQLAdapter interface: merge BulkInsert / BulkUpdate / Insert into one func, Upsert (the implementation should decide whether to run in a transaction based on the row count)
[ ] - Add the server name to the Task meta information (in Redis) and add the following logic at the beginning of task_executor.go : startTaskController():
- get the HeartBeat of all tasks and select those whose server name equals the current server (appconfig.ServerName)
- change the status of all tasks from the previous step to CANCELED and add a task log message about task canceling due to server restart.
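The restart logic from the last item could be sketched roughly as follows. This is a minimal in-memory illustration: the `Task` struct, its `ServerName` field, the status constants, and `cancelStalledTasks` are hypothetical names for this example, and the map stands in for the task heartbeats that would actually be read from Redis.

```go
package main

import (
	"fmt"
	"sort"
)

// Task status values used in this sketch; CANCELED is the status proposed
// for tasks interrupted by a server restart.
const (
	StatusRunning  = "RUNNING"
	StatusCanceled = "CANCELED"
)

// Task is a simplified, hypothetical view of the task meta information.
type Task struct {
	ID         string
	Status     string
	ServerName string   // proposed field: the server that was executing the task
	Logs       []string // task log messages
}

// cancelStalledTasks walks over all tasks that have a heartbeat, selects the
// ones owned by serverName, marks them CANCELED, and appends a task log
// message explaining why. It returns the IDs of the canceled tasks, sorted.
func cancelStalledTasks(heartbeats map[string]*Task, serverName string) []string {
	var canceled []string
	for id, task := range heartbeats {
		if task.ServerName != serverName {
			continue // the task belongs to another server; leave it alone
		}
		task.Status = StatusCanceled
		task.Logs = append(task.Logs,
			fmt.Sprintf("Task %s was canceled due to restart of server %s", id, serverName))
		canceled = append(canceled, id)
	}
	sort.Strings(canceled) // map iteration order is random; sort for stable output
	return canceled
}
```

In the real executor this would run once at the start of startTaskController(), with appconfig.ServerName passed as `serverName`, so that only tasks left behind by the restarted instance are touched.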