feature request: update comments for different detections

Open shinyano opened this issue 1 year ago • 10 comments

I run the tool separately on different parts of my project. My workflow looks like this and checks 3 parts of my code:

name: "Duplicate code"

on:
  issue_comment:
    types:
      - created
permissions:
  contents: read
  pull-requests: write

jobs:
  duplicate-code-check:
    name: Check for duplicate code
    runs-on: ubuntu-latest
    if: github.event.issue.pull_request && contains(github.event.comment.body, 'run_duplicate_code_detection')
    steps:
      - name: Check for duplicate code(core)
        uses: platisd/duplicate-code-detection-tool@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "core"
          file_extensions: "java, py"
          one_comment: true

      - name: Check for duplicate code(datasource)
        if: always()
        uses: platisd/duplicate-code-detection-tool@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "dataSources"
          file_extensions: "java, py"
          one_comment: true

      - name: Check for duplicate code(session, shared)
        if: always()
        uses: platisd/duplicate-code-detection-tool@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "session, shared"
          file_extensions: "java, py"
          one_comment: true

I would like to have one comment in the PR for each part (3 comments in total). When the workflow runs again, it should update those 3 comments.

Currently, it can only update the latest comment, so it creates one comment and then edits it twice. But if I choose to create a new comment every time, there will be too many comments.

shinyano avatar May 13 '24 08:05 shinyano

Interesting way of using the tool, I did not anticipate that. :sweat_smile:

To achieve what you describe, you would have to (as a user) provide some unique identifier yourself when invoking the action. This unique identifier will need to be part of the message so that the right message can be picked up. Not sure how the "UX" should be yet...

Maybe something like:

      - name: Check for duplicate code(datasource)
        if: always()
        uses: platisd/duplicate-code-detection-tool@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "dataSources"
          file_extensions: "java, py"
          one_comment: true
          unique_header_message_start: "## 📌 Duplicate code detection tool report (datasource)"

      - name: Check for duplicate code(session, shared)
        if: always()
        uses: platisd/duplicate-code-detection-tool@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "session, shared"
          file_extensions: "java, py"
          one_comment: true
          unique_header_message_start: "## 📌 Duplicate code detection tool report (session, shared)"

If the user doesn't provide a truly unique unique_header_message_start, things will not work correctly. Would that be a good way of working for you? Or do you have any other suggestions for how this should be used?
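To make the matching idea concrete, here is a minimal sketch of how a header prefix could be used to pick the comment to update. It is not the action's actual code: `find_comment_to_update` and the dictionary shape of the comments are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def find_comment_to_update(comments: List[Dict], header: str) -> Optional[Dict]:
    """Return the most recent comment whose body starts with `header`.

    Hypothetical helper: `comments` is assumed to be ordered oldest first,
    with each comment represented as a dict that has "id" and "body" keys.
    """
    for comment in reversed(comments):
        if comment["body"].startswith(header):
            return comment
    return None


comments = [
    {"id": 1, "body": "## 📌 Duplicate code detection tool report (datasource)\nold results"},
    {"id": 2, "body": "## 📌 Duplicate code detection tool report (session, shared)\nold results"},
    {"id": 3, "body": "LGTM"},
]

# A unique header selects exactly the comment that belongs to that step.
match = find_comment_to_update(
    comments, "## 📌 Duplicate code detection tool report (datasource)"
)

# A header that is only a prefix of the others is not unique: it picks up
# whichever matching comment happens to be the latest.
ambiguous = find_comment_to_update(
    comments, "## 📌 Duplicate code detection tool report"
)
```

In this sketch the non-unique header selects the "(session, shared)" comment, even though the caller may have meant a different one, which is the failure mode described above.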

platisd avatar May 13 '24 09:05 platisd

Or alternatively some unique_id that will be placed after the default message within parentheses, similar to how I have it above but the user would only have to provide what's within the parentheses and not the entire thing.

platisd avatar May 13 '24 09:05 platisd

Yeah, I think providing a special message header is a good option. I'm also considering generating a unique message header automatically from the target directories. When the user sets:

directories: "target"

Maybe the message header can be generated as ## 📌 Duplicate code detection tool report (target)

shinyano avatar May 14 '24 03:05 shinyano

Do you mean that the action should generate these identifiers without the user's intervention?

platisd avatar May 14 '24 04:05 platisd

Yes, I think that's more convenient and easier.

shinyano avatar May 14 '24 07:05 shinyano

It is indeed easier, but only for this particular use case. I am thinking that having the specific directory names in the title works well if there are only a few, but if there were, let's say, 10 folders, the title would get huge and things would look ugly. :thinking:
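As a rough illustration of that concern, the sketch below derives a header from the `directories` input and compares the result for one directory against ten. `header_from_directories` and the ten directory names are made up for this example, and the default header text is taken from the snippets earlier in this thread.

```python
# Header text as it appears in the examples earlier in this thread.
DEFAULT_HEADER = "## 📌 Duplicate code detection tool report"


def header_from_directories(directories: str) -> str:
    """Build a header from a comma-separated `directories` value.

    Hypothetical helper, not part of the action.
    """
    names = [name.strip() for name in directories.split(",") if name.strip()]
    return f"{DEFAULT_HEADER} ({', '.join(names)})"


# One directory gives a compact title.
short_header = header_from_directories("target")

# Ten (made-up) directories make the title grow with every entry.
many_directories = ", ".join(f"module_{i}/src/main" for i in range(10))
long_header = header_from_directories(many_directories)

print(short_header)
print(len(short_header), len(long_header))
```

The generated title grows linearly with the number of directories, which is why a short user-provided identifier scales better.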

platisd avatar May 14 '24 07:05 platisd

Yeah, that's a problem... Do you think letting the user provide the message header is a better idea, like what you showed earlier?

shinyano avatar May 14 '24 08:05 shinyano

Also, do you think it would be a good and practical idea to ignore short files? I get a lot of short Java interfaces reported as highly similar to each other simply because they are so short. For example:

public interface Function {

  FunctionType getFunctionType();

  MappingType getMappingType();

  String getIdentifier();
}

and

public interface RowMappingFunction extends Function {

  Row transform(Row row, FunctionParams params) throws Exception;
}

They are reported as 86.97% similar (each file also has package and import lines, no more than 2 lines).

shinyano avatar May 15 '24 08:05 shinyano

In some cases I guess it would make sense. However, the user would have to opt in to such a feature (i.e. ignoring short files). In other words, it shouldn't be on by default.
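A minimal sketch of what such an opt-in could look like, assuming a hypothetical `min_lines` setting that is off (`None`) by default. None of these names exist in the action; they only illustrate the "disabled unless requested" behaviour.

```python
from typing import Dict, Optional


def count_code_lines(text: str) -> int:
    """Count the non-blank lines in a source file."""
    return sum(1 for line in text.splitlines() if line.strip())


def drop_short_files(
    sources: Dict[str, str], min_lines: Optional[int] = None
) -> Dict[str, str]:
    """Filter out files shorter than `min_lines` non-blank lines.

    Hypothetical opt-in: with the default `min_lines=None` nothing is
    filtered, so existing behaviour stays unchanged.
    """
    if min_lines is None:
        return dict(sources)
    return {
        path: text
        for path, text in sources.items()
        if count_code_lines(text) >= min_lines
    }


sources = {
    "Function.java": (
        "public interface Function {\n\n"
        "  FunctionType getFunctionType();\n\n"
        "  MappingType getMappingType();\n\n"
        "  String getIdentifier();\n}\n"
    ),
    "RowMappingFunction.java": (
        "public interface RowMappingFunction extends Function {\n\n"
        "  Row transform(Row row, FunctionParams params) throws Exception;\n}\n"
    ),
    # Made-up longer file so that something survives the filter.
    "Engine.java": "\n".join(f"int x{i} = {i};" for i in range(20)),
}

unfiltered = drop_short_files(sources)              # default: keep everything
filtered = drop_short_files(sources, min_lines=10)  # opt-in: drop short files
```

With the threshold enabled, the two short interfaces from the example above are skipped and only the longer file would be compared.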

platisd avatar May 15 '24 08:05 platisd

That's an option yeah.

shinyano avatar May 15 '24 08:05 shinyano

Can you try out the branch in #33?

      - name: Check for duplicate code(datasource)
        if: always()
        uses: platisd/duplicate-code-detection-tool@user_configurable_message_header
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "dataSources"
          file_extensions: "java, py"
          one_comment: true
          unique_header_message_start: "## 📌 Duplicate code detection tool report (datasource)"

      - name: Check for duplicate code(session, shared)
        if: always()
        uses: platisd/duplicate-code-detection-tool@user_configurable_message_header
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          directories: "session, shared"
          file_extensions: "java, py"
          one_comment: true
          unique_header_message_start: "## 📌 Duplicate code detection tool report (session, shared)"

platisd avatar Jun 01 '24 11:06 platisd

It's working exactly as expected! Thank you so much for your work, and a big apology for the very late reply <3

shinyano avatar Jun 19 '24 14:06 shinyano