drill
drill copied to clipboard
Drill 7882 + Drill 7883 - Fix LGTM Alerts in /common and /contrib
DRILL-7882: Fix LGTM Alerts in common folder
#DRILL-7883: Fix LGTM Alerts in contrib folder
Done
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/config/DrillProperties.java?sort=name&dir=ASC&mode=heatmap DrillProperties.java line 115 Added synchronized keyword to method
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/HistoricalLog.java?sort=name&dir=ASC&mode=heatmap HistoricalLog.java Line 122 & 129 Suppressed b/c they were comments
in format-excel
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-excel/src/main/java/org/apache/drill/exec/store/excel/ExcelBatchReader.java#x4f0d3a45123fb50b:1 excelbatchreader.java line 280 checked if datacell is null before switch case statement
in format-hdf5
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/HDF5BatchReader.java?sort=name&dir=ASC&mode=heatmap HDF5BatchReader.java line 593 Changed {} to %s so that format call works
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5ByteDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5ByteDataWriter.java line 71 Despite the fact the counter only runs after, and that if write() would run again it would return false due to the if statement beforehand. LGTM still picks up on it. Thus, I have used a try and catch statement as a way to avoid the alert (although I have no way of testing unless its on the main repo).
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5DoubleDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5DoubleDataWriter.java line 69 Same as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5FloatDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5FloatDataWriter.java line 69 same as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5IntDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5IntDataWriter.java line 70 same as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5LongDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5LongDataWriter.java line 69 same as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5SmallIntDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5SmallIntDataWriter.java line 71 same as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5TimestampDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5TimeSTampDataWriter line 48 same as above
in format-img
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-image/src/main/java/org/apache/drill/exec/store/image/GenericMetadataDescriptor.java?sort=name&dir=ASC&mode=heatmap GenericMetadataDescriptor.java line 82,83,84 Converted type from Integer to int (didnt seem like there was a need for Integer Class)
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-image/src/main/java/org/apache/drill/exec/store/image/ImageDirectoryProcessor.java?sort=name&dir=ASC&mode=heatmap ImageDirectoryProcessor.java line 124 Suppressed, needs to be initialized with an arbitrary value (so keep it with null)
in format-maprdb
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/planner/index/MapRDBStatistics.java?sort=name&dir=ASC&mode=heatmap line 777 Added if (pattern != null) statement to avoid potential NPE error
line 801 same as above
line 807 if statement but for escape
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/BinaryTableGroupScan.java?sort=name&dir=ASC&mode=heatmap BinaryTableGroupScan.java line 190 To avoid int overflow, made numColumns a long variable
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java?sort=name&dir=ASC&mode=heatmap JsonTableGroupScan JsonTableGroupScan.java Line 380 boolean includes scanSpec != null (if spanspec is null then scanspec.getserializedfilter() would also be null)
line 493 suppressed because its comments
line 520 The 5th format call is for the estimated size, but there is no fn that gets/determines the estimated size... For now I put in "Can't determine estimated size" left it empty
line 527, 528, 541, 542, 632 All are comments, suppressed
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/MaprDBJsonRecordReader.java?sort=name&dir=ASC&mode=heatmap
MaprDBJsonRecordReader.java
Line 431
Changed suppression to suggestion that CodeQL ppl suggested
document == null || document.asReader() == null ? ...
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap MapRDBFormatPluginConfig.java Line 28 There is no equals function (there is impEquals, but is that the same thing?) but there is an overridden hashcode function. Again, I don't know what it's referring to.
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBGroupScan.java?sort=name&dir=ASC&mode=heatmap MapRDBGroupScan.java line 255 Format call had wrong syntax (for format(), use % not {} to take in args)
line 324 It already logs an error if null so there is no point in catching NPE, suppressed
In format-xml
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java?sort=name&dir=ASC&mode=heatmap XMLBatchReader.java line 94 changed {} to %s
In storage-druid
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-druid/src/main/java/org/apache/drill/exec/store/druid/DruidGroupScan.java?sort=name&dir=ASC&mode=heatmap DruidGroupScan.java line 201 Added L for long type specification
In storage-hbase
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java?sort=name&dir=ASC&mode=heatmap HBaseUtils.java line 79 Imported java.utils.Arrays to directly convert filterBytes to a string so that it does not implicitly convert it in the error msg
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/config/HBasePersistentStore.java?sort=name&dir=ASC&mode=heatmap HBasePersistentStore.java line 201 suppressed b/c its in try catch statement
In storage-kudu
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-kudu/src/main/java/org/apache/drill/exec/store/kudu/KuduGroupScan.java?sort=name&dir=ASC&mode=heatmap KuduGroupScan.java line 210 Added L for long type specification
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NetworkFunctions.java?sort=name&dir=ASC&mode=heatmap NetworkFunctions.java Line 434 Multiplied long with assignment so that there is no implicit conversion
TO-DO (revise) / ASK FOR HELP
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableRangePartitionFunction.java?sort=name&dir=ASC&mode=heatmap JsonTableRangePartitionFunction.java Line 46 There is an overridden equals function but no hashcode function so I don't know what it's referring to Suppressed
In storage-kafka
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-kafka/src/main/java/org/apache/drill/exec/store/kafka/KafkaPartitionScanSpec.java?sort=name&dir=ASC&mode=heatmap KafkaPartitionScanSpec.java line 25 There is no hashcode fn, if there is no fn, then there's no need for it, suppressed
In udfs
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/CryptoFunctions.java?sort=name&dir=ASC&mode=heatmap CryptoFunctions.java line 288 Alert recommends using AES, but code already uses AES encryption. Maybe the alternatives are weak? Suppressed for now.
line 339 Same reason as above
In storage-splunk
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchReader.java?sort=name&dir=ASC&mode=heatmap SplunkBatchReader.java line 232 Very unsure. I am not aware of what the contents are so I can't really analyze this. Suppressed for now.
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBTableCache.java?sort=name&dir=ASC&mode=heatmap MapRDBTableCache.java Line 73 If table is null, I believe that it should just throw an NPE regardless (especially if its required in maprdb altho im not entirely sure), it's suppressed
perhaps i could use a logger.debug() or something like that
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/streams/StreamsFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap StreamsFormatpluginConfig.javas line 27 Suppressed, hashcode and impequals are both overridden. From what I understand about the alert is trying to find hashcode and equals fn, but because it cant find equals (which is just impequals) it assumes that its not overridden
https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/TableFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap TableFormatPluginConfig.java Line 22 Same reason as above
https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/exceptions/UserException.java?sort=name&dir=ASC&mode=heatmap UserException.java Line 481 Suppressed first b/c suggested solution was not valid, format fn needs those 2 parameters
Line 643 It's already in a try statement alert suppressed
Documentation
N/A
Testing
Due to how LGTM alerts work, if it isnt on the actual project, we can't see if it works.
As a side note: I had not touched Drillbuf.java, or Drillfilesystem.java, but upon making this I found them. I can remove them if they're unnecessary.
Errors in CI involve drillbuf.java, DrillFileSystem.java.
Fixed
@eevanwong Thank you for the submission. I see a lot of suppressions here. Are you going to fix some of these, or just suppress the warnings?
Hey, yeah there are a lot of suppressions. The next step for me is to fully understand all the ??? ones, then determining if I can fix. I just wanted to make a pr now so I didnt have to do it later.
I do think that I would keep certain ones suppressed regardless, specifically the documentation ones, as that has no particular impact on the actual code.
Hey, could I get someones input on this? This is quite a small thing, but still something I want to make sure.
This is in JsonTableGroupScan.java, line 519. The alert is that there were only 4 arg to support the 5 that the format call was requesting for. The thing is, I can't find a fn in indexDesc that deals with finding the estimatedSize. For now, I put in some default string, but I would like to know if I could potentially find estimated size, or I could just delete that portion in the format call.
The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.
@eevanwong I will check the questions in your comments tomorrow.
Hey, could I get someones input on this? This is quite a small thing, but still something I want to make sure.
This is in JsonTableGroupScan.java, line 519. The alert is that there were only 4 arg to support the 5 that the format call was requesting for. The thing is, I can't find a fn in indexDesc that deals with finding the estimatedSize. For now, I put in some default string, but I would like to know if I could potentially find estimated size, or I could just delete that portion in the format call.
About all the alarms of the MapR module, you can ignore them directly (Maybe you only need to mark as comments in this PR).
The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.
The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.
I have noticed your question. Before resolve, I would like to ask you to research the following:
- How does LGTM.com work?
- Are all alarms meaningful on LGTM.com (Help focus your mind)?
- Could you please explain what is LGTM's suggestions for handle the issues (NPE, hashcode)?
-
According to their website, LGTM takes the codebase, "generates a database to represent the codebase", then uses codeQL to run queries on the database to generate alerts. They use data science to identify the errors (from analyzing a lot of other projects and finding common errors).
-
It depends. Sometimes, alerts are a false-positive (incorrect); however, pertaining to drill, most of the alerts were meaningful as they were valid issues that needed to be fixed; however, some of them are not. For example,
This already has a guard if the counter is greater than the length of the data, despite that, it still throws the alert, and so I tried to use a try and catch statement.
Also, like I said in my previous comment. Sometimes the value is initialized in the null value as an arbitrary value so that it can be used later. Just declaring it isn't enough as it causes an error.
- For hashcode, it recommends overriding both
hashCode
andequals
methods to ensure they're consistent. The thing is, in most cases, there is no hashCode or equals method.
e.g
This results in this hashcode alert
For nulls, it recommends to ensure that the value is not null when its dereferenced.
The thing, the alert is essentially explaining that the value might be null b/c it was initialized as so, but thats because it has to be arbitrary initialized so that it can be called in different statements.
@eevanwong Thanks. I don't like the false-positive (incorrect). It can take a lot of time.
- About overriding
hashCode
andequals
. I holding the review. If possible to count the total number of alerts related to this issues? - Nullable. It's related to the heap memory of JVM (Garbage Collection Algorithm)? Ensure the variable is not null for ever?
- Yes there are 17 alerts in total caused by this issue.
- I'm not sure if the null errors are related, the main issue on the alert was that it would point to an NPE if its value didnt change by the time it got to a statement that would use it.
I suppose I could try ensuring var is not null forever; however, it would just be towards another arbitrary value and it would be odd to do it depending on what type of variable it is (e.g File variable I wouldnt know what value to point it towards).
@eevanwong Thanks for doing this. This job is not easy, because there are many things that can be done or not. Is possible to contact the LGTM.com request a reply about these issues and our thoughts (with the odd recommendations)?
Yeah for sure I'll see what I can do.
Hey so, a bit of an update, for hash code and equals, tuns out you in fact do need both.
Yes, this is a legitimate concern. The contract between hashCode and equals is that equal instances must have the same hashCode. The reverse is not necessary, but high quality hashCode algorithms should avoid this as much as possible.
According to javadoc for the equals method:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#equals(java.lang.Object)
I suppose this might be a larger issue? I'm not very adept at Java so I myself am not very aware of proper conventions.
- Yes, there is still a few changes I can make before moving onto the uncertain ones.
- Yeah I'll work on what I can for now (altho I'm a little busy during the weekend so it'll have to wait)
- I agree the overview is a bit messy, I'll reorganize it.
Hi @eevanwong Could you please rebase with current master?
Sorry, been awhile, I got hard sidetracked by school and whatnot. That being said I think I screwed up the rebase. :/
I rebased successfully, when pushing to my original branch there were conflicts so I attempted to pull but then I had to resolve some merge conflicts.
Yeah... If you can roll back what you just did, here's how I rebase.
git checkout master
git fetch upstream
git rebase upstream/master
git checkout
There may be an easier way, but that way get's the job done. -- C
On May 21, 2021, at 5:46 PM, Evan W. @.***> wrote:
Sorry, been awhile, I got hard sidetracked by school and whatnot. That being said I think I screwed up the rebase. :/
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/apache/drill/pull/2191#issuecomment-846278755, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKB7PSNRR2XICHYMCIUSLLTO3ICHANCNFSM4ZVOWFHQ.
Ok, I think it worked
Is this similar to recently merged: #2310 ? @estherbuchwalter
I don't think that this overlaps with recently merged #2310. #2310 focused on adding Javadoc param tag descriptions and fixing Javadoc related errors.
Note: LGTM is shutting down in December 2022. See https://github.blog/2022-08-15-the-next-step-for-lgtm-com-github-code-scanning/.