incubator-devlake

[Bug][Module Name] GitHub Pull Requests draft status not being processed

Open jbsmith7741 opened this issue 10 months ago • 4 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

What happened

In the table pull_requests the column is_draft is not being populated correctly. All values are either false or null.

In the _raw_github_api_pull_requests_reviews table, the 'data' column holds the raw JSON returned by the GitHub API, which includes the value "draft":true.

The is_draft column in the _tool_github_pull_requests table for that PR is set to false.

What do you expect to happen

Any pull request that is marked as draft should have the value of is_draft in the pull_requests table set to true.

How to reproduce

  1. Create a Pull Request in a GitHub repo that is being ingested into DevLake.
  2. Change the status of the Pull Request to draft.
  3. Have DevLake pull in and process GitHub data.
  4. Verify raw data was added in the _raw_github_api_pull_requests_reviews table
  5. Verify that is_draft is false in the _tool_github_pull_requests table
  6. Verify that is_draft is false in the pull_requests table
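The comparison in steps 4 to 6 can be sketched in Go. This is only an illustration under assumptions: rawPR, storedPR, and findDraftMismatches are hypothetical names, not DevLake types, and the rows are passed in as plain values rather than read from the database.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawPR keeps only the fields of interest from a raw GitHub pull request payload.
type rawPR struct {
	Number int  `json:"number"`
	Draft  bool `json:"draft"`
}

// storedPR is a hypothetical stand-in for one row of _tool_github_pull_requests.
type storedPR struct {
	Number  int
	IsDraft *bool // nil models a NULL column value
}

// findDraftMismatches returns the PR numbers whose stored is_draft value is
// missing, NULL, or different from the "draft" value in the raw payload.
func findDraftMismatches(rawPayloads []string, stored map[int]storedPR) ([]int, error) {
	var mismatches []int
	for _, payload := range rawPayloads {
		var pr rawPR
		if err := json.Unmarshal([]byte(payload), &pr); err != nil {
			return nil, fmt.Errorf("decode raw payload: %w", err)
		}
		row, ok := stored[pr.Number]
		if !ok || row.IsDraft == nil || *row.IsDraft != pr.Draft {
			mismatches = append(mismatches, pr.Number)
		}
	}
	return mismatches, nil
}

func main() {
	notDraft := false
	raw := []string{`{"number":1,"draft":true}`, `{"number":2,"draft":false}`}
	stored := map[int]storedPR{
		1: {Number: 1, IsDraft: &notDraft}, // the symptom described in this report
		2: {Number: 2, IsDraft: &notDraft},
	}
	mismatches, err := findDraftMismatches(raw, stored)
	fmt.Println(mismatches, err)
}
```

With the behavior described in this report, PR 1 would be flagged because the raw payload says draft while the stored row says false.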

Anything else

No response

Version

v1.01

Are you willing to submit PR?

  • [x] Yes I am willing to submit a PR!

Code of Conduct

jbsmith7741 · May 01 '25 19:05

Hello @jbsmith7741! I'm Dosu, your friendly bot here to help with bugs, answer questions, and guide you on your journey to becoming a contributor. I'm ready to assist while we wait for a human maintainer. Let's dive into this together! 🚀

The issue you're experiencing with the is_draft column not being populated correctly might be related to how the data is being processed and converted from the GitHub API. The is_draft column in the _tool_github_pull_requests table is supposed to be populated directly from the IsDraft field in the models.GithubPullRequest struct, which is derived from the raw GitHub API data [1].

To address this issue, you might want to check the ConvertPullRequests function in the pr_convertor.go file to ensure that the conversion logic is correctly mapping the IsDraft field from the raw data to the domain-specific format. Additionally, verify that the extraction and conversion tasks are correctly processing the draft status from the raw JSON data.

If the conversion logic is correct, consider reviewing the data ingestion process to ensure that the draft status is being correctly captured and updated in the raw data tables before conversion. This might involve checking the extraction tasks and ensuring they are correctly handling updates to pull request statuses, such as changing from open to draft.

If you are willing to submit a PR, you could focus on these areas to ensure the draft status is correctly processed and reflected in the relevant tables.
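As a rough illustration of the mapping described above, the draft flag has to survive two hops: raw JSON to the tool-layer record, then tool-layer record to the domain-layer record. The sketch below uses simplified stand-in types (apiPullRequest, toolPullRequest, domainPullRequest) and hypothetical function names; it is not the DevLake extractor or convertor code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiPullRequest mirrors the two fields of the GitHub API payload used here.
type apiPullRequest struct {
	Number int  `json:"number"`
	Draft  bool `json:"draft"`
}

// toolPullRequest is a simplified stand-in for a tool-layer row.
type toolPullRequest struct {
	Number  int
	IsDraft bool
}

// domainPullRequest is a simplified stand-in for a domain-layer row.
type domainPullRequest struct {
	PullRequestKey int
	IsDraft        bool
}

// extractPullRequest decodes a raw payload and copies "draft" into IsDraft.
// Omitting this assignment would leave IsDraft at its zero value, false.
func extractPullRequest(raw []byte) (toolPullRequest, error) {
	var api apiPullRequest
	if err := json.Unmarshal(raw, &api); err != nil {
		return toolPullRequest{}, fmt.Errorf("decode pull request: %w", err)
	}
	return toolPullRequest{Number: api.Number, IsDraft: api.Draft}, nil
}

// convertPullRequest carries IsDraft from the tool layer to the domain layer.
func convertPullRequest(t toolPullRequest) domainPullRequest {
	return domainPullRequest{PullRequestKey: t.Number, IsDraft: t.IsDraft}
}

func main() {
	tool, err := extractPullRequest([]byte(`{"number":7,"draft":true}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", convertPullRequest(tool))
}
```

If either hop forgets to copy the field, the Go zero value makes every row look like a non-draft, which matches the all-false symptom reported here.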

To continue talking to Dosu, mention @dosu.

dosubot[bot] · May 01 '25 19:05

Thanks for reporting the bug, and looking forward to your PR.

klesh · May 12 '25 06:05

https://github.com/apache/incubator-devlake/pull/8438 This is the initial PR, but still needs testing.

@dosu Could you provide guidance on which tests need to be updated and any other steps that can be used to verify this fix?

jbsmith7741 · May 13 '25 16:05

To verify the fix for the is_draft field in PR #8438, you should focus on updating and running the following tests:

  1. Update Snapshot CSV Files: Ensure that the snapshot CSV files used in the VerifyTable and VerifyTableWithOptions methods include the is_draft field with the correct values. Specifically, check the files ./snapshot_tables/_tool_github_pull_requests.csv and ./snapshot_tables/pull_requests.csv to ensure they reflect the expected draft status for pull requests [1].

  2. E2E Test Verification: The TestGithubPRsDataFlow function in pr_test.go already includes verification for the is_draft field. Ensure that this test is up-to-date and correctly verifies the is_draft field in both the GithubPullRequest and PullRequest tables [1].

  3. Run Tests: Execute the tests using the following command to ensure that the changes are correctly verified:

    make test
    
  4. Verify Data Ingestion: Ensure that the data ingestion process correctly captures the draft status from the raw GitHub API data. This involves checking that the is_draft field is correctly set in the raw data tables before conversion [1].

By following these steps, you can ensure that the is_draft field is correctly processed and verified in the relevant tables.

dosubot[bot] · May 13 '25 16:05

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] · Jul 13 '25 00:07

This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter a similar problem in the future.

github-actions[bot] · Jul 20 '25 00:07