hop
hop copied to clipboard
[Bug]: Dimension lookup/update transform "Unable to retrieve data type of return fields because of an unexpected error"
Apache Hop version?
2.7
Java version?
openjdk 11.0.21 2023-10-17
Operating system
macOS
What happened?
Running a pipeline with the dimension lookup/updated transform results in an error and exception. The error message says that it could not retrieve the data type of the return fields although that is specified.
org.apache.hop.core.exception.HopTransformException: Unable to retrieve data type of return fields because of an unexpected error at java.lang.Thread.run (Thread.java:829) at org.apache.hop.pipeline.transform.RunThread.run (RunThread.java:55) at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.processRow (DimensionLookup.java:99) at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields (DimensionLookupMeta.java:286)
at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields(DimensionLookupMeta.java:307)
at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.processRow(DimensionLookup.java:99)
at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:55)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields(DimensionLookupMeta.java:286) ... 3 more 2024/01/12 13:39:24 - Lookup Building block.0 - ERROR: Unexpected error 2024/01/12 13:39:24 - Lookup Building block.0 - ERROR: org.apache.hop.core.exception.HopTransformException: 2024/01/12 13:39:24 - Lookup Building block.0 - Unable to retrieve data type of return fields because of an unexpected error 2024/01/12 13:39:24 - Lookup Building block.0 - at java.lang.Thread.run (Thread.java:829) 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transform.RunThread.run (RunThread.java:55) 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.processRow (DimensionLookup.java:99) 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields (DimensionLookupMeta.java:286) 2024/01/12 13:39:24 - Lookup Building block.0 - 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields(DimensionLookupMeta.java:307) 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.processRow(DimensionLookup.java:99) 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:55) 2024/01/12 13:39:24 - Lookup Building block.0 - at java.base/java.lang.Thread.run(Thread.java:829) 2024/01/12 13:39:24 - Lookup Building block.0 - Caused by: java.lang.NullPointerException 2024/01/12 13:39:24 - Lookup Building block.0 - at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookupMeta.getFields(DimensionLookupMeta.java:286) 2024/01/12 13:39:24 - Lookup Building block.0 - ... 3 more 2024/01/12 13:39:24 - Lookup Building block.0 - child index = 3, logging object : org.apache.hop.core.logging.LoggingObject@3c86d77f parent=63896a50-b6ee-42c2-847a-e90562bac780
If helpful I am happy to provide more information
Issue Priority
Priority: 2
Issue Component
Component: Transforms
Currently I do the following work around: Open pipeline/transform in HOP. Change the type of return field to something else. Save. Change back to correct value. Save. Run in hop -> works - Strange
On which field is the return type incorrect? Is this a date/datetime field?
in the example above it is only one return field (see screenshot) and the type of that field is a String in Hop and a varchar(255) in the DB (Azure SQL Server).
In other pipelines I have the same issue with fields that I declared as Integer in HOP and which are int in the DB.
There may be a bug if the pipeline was created with a version < 2.7. When you return to the correct value, does the problem reappear in version 2.7?
I have the same problem. Here some more data in case it helps: If I do the work around described by matze42, it works but if execute a "Clear database cache" it fails again, exactly the same way.
I have tried with HOP 2.7.0 and HOP 2.5.0 executing (after tuning) workflows imported from kettle and 2.7.0 native workflows: same behaviour.
There may be a bug if the pipeline was created with a version < 2.7. When you return to the correct value, does the problem reappear in version 2.7?
Not 100% sure I get what you ask for. The pipeline was created before 2.7. After applying the workaround it works. Next month (because I do a monthly loading cycle) the problem is there again. In other words; the problem reoccurs even if I stay within 2.7.
I have the same problem. Here some more data in case it helps: If I do the work around described by matze42, it works but if execute a "Clear database cache" it fails again, exactly the same way.
Following your input I can confirm that clearing the database cache triggers the error for me as well. Sequence: pipeline works -> clear database cache -> pipeline does not work
I rescue this issue to see if it can be moved to priority 1. In my case, we have more than 100 transformations that we have imported from kettle to load our DWH. We have done semi-automatic and manual corrections after import, but this issue is blocking to migrate the load from kettle to HOP and start with testing in the pre-production environment. I am available to give more information if needed. Thanks a lot.
Belén
I suspect a problem loading/saving transformation metadata when changing the "Update dimension" mode. In the "Fields" tab, we switch from "Type of return field" to "Type of dimension" in the same column without adapting the values.
Step to reproduce:
- Disable "Update dimension"
- Create a field with a type of return field STRING
- Enable "Update dimension"
- Save the pipeline
- Reopen the pipeline
There are 3 points to consider:
- load/save metadata problem when changing dimension update
- if the return data type in the database is not recognized in lookup mode (community help needed).
- unnecessary "Type of return field" information never used in the code, which should not be encoded by user but comes from the database