drill icon indicating copy to clipboard operation
drill copied to clipboard

Drill 7882 + Drill 7883 - Fix LGTM Alerts in /common and /contrib

Open eevanwong opened this issue 3 years ago • 24 comments

DRILL-7882: Fix LGTM Alerts in common folder

#DRILL-7883: Fix LGTM Alerts in contrib folder

Done

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/config/DrillProperties.java?sort=name&dir=ASC&mode=heatmap DrillProperties.java line 115 Added synchronized keyword to method

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/HistoricalLog.java?sort=name&dir=ASC&mode=heatmap HistoricalLog.java Line 122 & 129 Suppressed b/c they were comments

in format-excel

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-excel/src/main/java/org/apache/drill/exec/store/excel/ExcelBatchReader.java#x4f0d3a45123fb50b:1 excelbatchreader.java line 280 checked if datacell is null before switch case statement

in format-hdf5

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/HDF5BatchReader.java?sort=name&dir=ASC&mode=heatmap HDF5BatchReader.java line 593 Changed {} to %s so that format call works

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5ByteDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5ByteDataWriter.java line 71 Despite the fact the counter only runs after, and that if write() would run again it would return false due to the if statement beforehand. LGTM still picks up on it. Thus, I have used a try and catch statement as a way to avoid the alert (although I have no way of testing unless its on the main repo).

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5DoubleDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5DoubleDataWriter.java line 69 Same as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5FloatDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5FloatDataWriter.java line 69 same as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5IntDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5IntDataWriter.java line 70 same as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5LongDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5LongDataWriter.java line 69 same as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5SmallIntDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5SmallIntDataWriter.java line 71 same as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-hdf5/src/main/java/org/apache/drill/exec/store/hdf5/writers/HDF5TimestampDataWriter.java?sort=name&dir=ASC&mode=heatmap HDF5TimeSTampDataWriter line 48 same as above

in format-img

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/contrib/format-image/src/main/java/org/apache/drill/exec/store/image/GenericMetadataDescriptor.java?sort=name&dir=ASC&mode=heatmap GenericMetadataDescriptor.java line 82,83,84 Converted type from Integer to int (didnt seem like there was a need for Integer Class)

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-image/src/main/java/org/apache/drill/exec/store/image/ImageDirectoryProcessor.java?sort=name&dir=ASC&mode=heatmap ImageDirectoryProcessor.java line 124 Suppressed, needs to be initialized with an arbitrary value (so keep it with null)

in format-maprdb

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/planner/index/MapRDBStatistics.java?sort=name&dir=ASC&mode=heatmap line 777 Added if (pattern != null) statement to avoid potential NPE error

line 801 same as above

line 807 if statement but for escape

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/BinaryTableGroupScan.java?sort=name&dir=ASC&mode=heatmap BinaryTableGroupScan.java line 190 To avoid int overflow, made numColumns a long variable

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java?sort=name&dir=ASC&mode=heatmap JsonTableGroupScan JsonTableGroupScan.java Line 380 boolean includes scanSpec != null (if spanspec is null then scanspec.getserializedfilter() would also be null)

line 493 suppressed because its comments

line 520 The 5th format call is for the estimated size, but there is no fn that gets/determines the estimated size... For now I put in "Can't determine estimated size" left it empty

line 527, 528, 541, 542, 632 All are comments, suppressed

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/MaprDBJsonRecordReader.java?sort=name&dir=ASC&mode=heatmap MaprDBJsonRecordReader.java Line 431 Changed suppression to suggestion that CodeQL ppl suggested document == null || document.asReader() == null ? ...

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap MapRDBFormatPluginConfig.java Line 28 There is no equals function (there is impEquals, but is that the same thing?) but there is an overridden hashcode function. Again, I don't know what it's referring to.

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBGroupScan.java?sort=name&dir=ASC&mode=heatmap MapRDBGroupScan.java line 255 Format call had wrong syntax (for format(), use % not {} to take in args)

line 324 It already logs an error if null so there is no point in catching NPE, suppressed

In format-xml

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java?sort=name&dir=ASC&mode=heatmap XMLBatchReader.java line 94 changed {} to %s

In storage-druid

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-druid/src/main/java/org/apache/drill/exec/store/druid/DruidGroupScan.java?sort=name&dir=ASC&mode=heatmap DruidGroupScan.java line 201 Added L for long type specification

In storage-hbase

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java?sort=name&dir=ASC&mode=heatmap HBaseUtils.java line 79 Imported java.utils.Arrays to directly convert filterBytes to a string so that it does not implicitly convert it in the error msg

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/config/HBasePersistentStore.java?sort=name&dir=ASC&mode=heatmap HBasePersistentStore.java line 201 suppressed b/c its in try catch statement

In storage-kudu

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-kudu/src/main/java/org/apache/drill/exec/store/kudu/KuduGroupScan.java?sort=name&dir=ASC&mode=heatmap KuduGroupScan.java line 210 Added L for long type specification

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NetworkFunctions.java?sort=name&dir=ASC&mode=heatmap NetworkFunctions.java Line 434 Multiplied long with assignment so that there is no implicit conversion

TO-DO (revise) / ASK FOR HELP

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableRangePartitionFunction.java?sort=name&dir=ASC&mode=heatmap JsonTableRangePartitionFunction.java Line 46 There is an overridden equals function but no hashcode function so I don't know what it's referring to Suppressed

In storage-kafka

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-kafka/src/main/java/org/apache/drill/exec/store/kafka/KafkaPartitionScanSpec.java?sort=name&dir=ASC&mode=heatmap KafkaPartitionScanSpec.java line 25 There is no hashcode fn, if there is no fn, then there's no need for it, suppressed

In udfs

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/CryptoFunctions.java?sort=name&dir=ASC&mode=heatmap CryptoFunctions.java line 288 Alert recommends using AES, but code already uses AES encryption. Maybe the alternatives are weak? Suppressed for now.

line 339 Same reason as above

In storage-splunk

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchReader.java?sort=name&dir=ASC&mode=heatmap SplunkBatchReader.java line 232 Very unsure. I am not aware of what the contents are so I can't really analyze this. Suppressed for now.

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBTableCache.java?sort=name&dir=ASC&mode=heatmap MapRDBTableCache.java Line 73 If table is null, I believe that it should just throw an NPE regardless (especially if its required in maprdb altho im not entirely sure), it's suppressed

perhaps i could use a logger.debug() or something like that

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/streams/StreamsFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap StreamsFormatpluginConfig.javas line 27 Suppressed, hashcode and impequals are both overridden. From what I understand about the alert is trying to find hashcode and equals fn, but because it cant find equals (which is just impequals) it assumes that its not overridden

https://lgtm.com/projects/g/apache/drill/snapshot/e2a0925dd18aacf3a5657acd738f89a63a3b8576/files/contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/TableFormatPluginConfig.java?sort=name&dir=ASC&mode=heatmap TableFormatPluginConfig.java Line 22 Same reason as above

https://lgtm.com/projects/g/apache/drill/snapshot/5ba93040059efc36da020f3cfd1ad31489e2e55e/files/common/src/main/java/org/apache/drill/common/exceptions/UserException.java?sort=name&dir=ASC&mode=heatmap UserException.java Line 481 Suppressed first b/c suggested solution was not valid, format fn needs those 2 parameters

Line 643 It's already in a try statement alert suppressed

Documentation

N/A

Testing

Due to how LGTM alerts work, if it isnt on the actual project, we can't see if it works.

eevanwong avatar Mar 23 '21 15:03 eevanwong

As a side note: I had not touched Drillbuf.java, or Drillfilesystem.java, but upon making this I found them. I can remove them if they're unnecessary.

eevanwong avatar Mar 23 '21 15:03 eevanwong

Errors in CI involve drillbuf.java, DrillFileSystem.java.

Fixed

eevanwong avatar Mar 23 '21 17:03 eevanwong

@eevanwong Thank you for the submission. I see a lot of suppressions here. Are you going to fix some of these, or just suppress the warnings?

cgivre avatar Mar 23 '21 21:03 cgivre

Hey, yeah there are a lot of suppressions. The next step for me is to fully understand all the ??? ones, then determining if I can fix. I just wanted to make a pr now so I didnt have to do it later.

I do think that I would keep certain ones suppressed regardless, specifically the documentation ones, as that has no particular impact on the actual code.

eevanwong avatar Mar 23 '21 23:03 eevanwong

Hey, could I get someones input on this? This is quite a small thing, but still something I want to make sure.

image

This is in JsonTableGroupScan.java, line 519. The alert is that there were only 4 arg to support the 5 that the format call was requesting for. The thing is, I can't find a fn in indexDesc that deals with finding the estimatedSize. For now, I put in some default string, but I would like to know if I could potentially find estimated size, or I could just delete that portion in the format call.

eevanwong avatar Mar 26 '21 15:03 eevanwong

The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.

eevanwong avatar Mar 26 '21 17:03 eevanwong

@eevanwong I will check the questions in your comments tomorrow.

luocooong avatar Mar 27 '21 16:03 luocooong

Hey, could I get someones input on this? This is quite a small thing, but still something I want to make sure.

image

This is in JsonTableGroupScan.java, line 519. The alert is that there were only 4 arg to support the 5 that the format call was requesting for. The thing is, I can't find a fn in indexDesc that deals with finding the estimatedSize. For now, I put in some default string, but I would like to know if I could potentially find estimated size, or I could just delete that portion in the format call.

About all the alarms of the MapR module, you can ignore them directly (Maybe you only need to mark as comments in this PR).

luocooong avatar Mar 28 '21 08:03 luocooong

The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.

luocooong avatar Mar 28 '21 08:03 luocooong

The most common issues that I suppressed involved null values and potential NPE and override hashcode and equals. The thing with the null values is that, most times in the codebase, if the value was null it would already be dealt with (via null guard or logger); however, it still marks it as an alert. With hashcode and equals, there were sometimes a need for only hashcode and not equals, and vice versa. If one or the other were not present in the file, it would throw an alert. Also, in some cases, the equals function was there, but it was not named equals (it was named impEquals), which LGTM prob did not pick up, and threw the alert.

I have noticed your question. Before resolve, I would like to ask you to research the following:

  1. How does LGTM.com work?
  2. Are all alarms meaningful on LGTM.com (Help focus your mind)?
  3. Could you please explain what is LGTM's suggestions for handle the issues (NPE, hashcode)?

luocooong avatar Mar 28 '21 08:03 luocooong

  1. According to their website, LGTM takes the codebase, "generates a database to represent the codebase", then uses codeQL to run queries on the database to generate alerts. They use data science to identify the errors (from analyzing a lot of other projects and finding common errors).

  2. It depends. Sometimes, alerts are a false-positive (incorrect); however, pertaining to drill, most of the alerts were meaningful as they were valid issues that needed to be fixed; however, some of them are not. For example,

image This already has a guard if the counter is greater than the length of the data, despite that, it still throws the alert, and so I tried to use a try and catch statement.

Also, like I said in my previous comment. Sometimes the value is initialized in the null value as an arbitrary value so that it can be used later. Just declaring it isn't enough as it causes an error.

  1. For hashcode, it recommends overriding both hashCode and equals methods to ensure they're consistent. The thing is, in most cases, there is no hashCode or equals method.

e.g image This results in this hashcode alert

image

For nulls, it recommends to ensure that the value is not null when its dereferenced. image

The thing, the alert is essentially explaining that the value might be null b/c it was initialized as so, but thats because it has to be arbitrary initialized so that it can be called in different statements.

eevanwong avatar Mar 29 '21 17:03 eevanwong

@eevanwong Thanks. I don't like the false-positive (incorrect). It can take a lot of time.

  1. About overriding hashCode and equals. I holding the review. If possible to count the total number of alerts related to this issues?
  2. Nullable. It's related to the heap memory of JVM (Garbage Collection Algorithm)? Ensure the variable is not null for ever?

luocooong avatar Apr 01 '21 12:04 luocooong

  1. Yes there are 17 alerts in total caused by this issue.

image

  1. I'm not sure if the null errors are related, the main issue on the alert was that it would point to an NPE if its value didnt change by the time it got to a statement that would use it.

I suppose I could try ensuring var is not null forever; however, it would just be towards another arbitrary value and it would be odd to do it depending on what type of variable it is (e.g File variable I wouldnt know what value to point it towards).

eevanwong avatar Apr 01 '21 16:04 eevanwong

@eevanwong Thanks for doing this. This job is not easy, because there are many things that can be done or not. Is possible to contact the LGTM.com request a reply about these issues and our thoughts (with the odd recommendations)?

luocooong avatar Apr 03 '21 13:04 luocooong

Yeah for sure I'll see what I can do.

eevanwong avatar Apr 03 '21 14:04 eevanwong

Hey so, a bit of an update, for hash code and equals, tuns out you in fact do need both.

Yes, this is a legitimate concern. The contract between hashCode and equals is that equal instances must have the same hashCode. The reverse is not necessary, but high quality hashCode algorithms should avoid this as much as possible.

According to javadoc for the equals method:

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#equals(java.lang.Object)

I suppose this might be a larger issue? I'm not very adept at Java so I myself am not very aware of proper conventions.

eevanwong avatar Apr 07 '21 13:04 eevanwong

  1. Yes, there is still a few changes I can make before moving onto the uncertain ones.
  2. Yeah I'll work on what I can for now (altho I'm a little busy during the weekend so it'll have to wait)
  3. I agree the overview is a bit messy, I'll reorganize it.

eevanwong avatar Apr 10 '21 15:04 eevanwong

Hi @eevanwong Could you please rebase with current master?

cgivre avatar May 21 '21 16:05 cgivre

Sorry, been awhile, I got hard sidetracked by school and whatnot. That being said I think I screwed up the rebase. :/

eevanwong avatar May 21 '21 21:05 eevanwong

I rebased successfully, when pushing to my original branch there were conflicts so I attempted to pull but then I had to resolve some merge conflicts.

eevanwong avatar May 21 '21 21:05 eevanwong

Yeah... If you can roll back what you just did, here's how I rebase.

git checkout master git fetch upstream git rebase upstream/master git checkout git rebase master

There may be an easier way, but that way get's the job done. -- C

On May 21, 2021, at 5:46 PM, Evan W. @.***> wrote:

Sorry, been awhile, I got hard sidetracked by school and whatnot. That being said I think I screwed up the rebase. :/

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/apache/drill/pull/2191#issuecomment-846278755, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKB7PSNRR2XICHYMCIUSLLTO3ICHANCNFSM4ZVOWFHQ.

cgivre avatar May 21 '21 21:05 cgivre

Ok, I think it worked

eevanwong avatar May 21 '21 22:05 eevanwong

Is this similar to recently merged: #2310 ? @estherbuchwalter

vdiravka avatar Oct 18 '21 14:10 vdiravka

I don't think that this overlaps with recently merged #2310. #2310 focused on adding Javadoc param tag descriptions and fixing Javadoc related errors.

estherbuchwalter avatar Oct 18 '21 15:10 estherbuchwalter

Note: LGTM is shutting down in December 2022. See https://github.blog/2022-08-15-the-next-step-for-lgtm-com-github-code-scanning/.

cgivre avatar Aug 23 '22 14:08 cgivre