[Bug][GitHub] Invalid data row (400)
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
I was doing some cleanup/setup in our connections and hit an issue.
I added a GitHub repository to a connection, then removed that repository and deleted the connection.
I created a new connection and wanted to add the repository in the "add data scope" dialog. This triggers the error "invalid data row (400)" (it looks like it comes from this line: https://github.com/apache/incubator-devlake/blob/main/backend/helpers/pluginhelper/api/ds_scope_api_helper.go#L116-L118).
See the screenshot (sorry I had to anonymize the repo names as this is internal company stuff I can't disclose):
There are other repositories I was able to add and move around successfully; it's just this specific one that fails.
I am connected through a GitHub App, and this repo has the same permissions as the others.
Is there any way I can help debug this further? Some extra logs? Maybe the deletion did not work well and there is some cruft remaining in the database (what should I look for?).
What do you expect to happen
Adding the repo should work as for the others
Also a clearer error message would be welcome (is this a github issue? a database issue?)
How to reproduce
It's hard to tell.
Adding/removing a datasource (and thus deleting the data), then adding it back might be a way to reproduce it. I can try to work on a way to "always" reproduce.
Anything else
I noticed the database we use was quite saturated at the moment the data source was removed/deleted. I'm using an AWS RDS t3.micro instance, which is a burstable instance, and it was really struggling while removing multiple data sources. I probably hit a burst limit, so maybe my database content is slightly corrupted. But this is just a hypothesis; it may be something completely different that is broken.
Version
v1.0.2-beta5 and v1.0.2-beta7
Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Hello @iMichka! I'm here to help you with bugs, answer your questions, and assist you in becoming a contributor. While you wait for a human maintainer, feel free to share any details, and I'll do my best to assist you!
The "invalid data row (400)" error you're encountering typically occurs when the data provided is not in the expected format. Specifically, in the PutMultiple function, each item in the data array is expected to be a map with string keys and interface values. If any item in the data array is not a map, this error is triggered [1].
Given your situation, it's possible that the deletion process did not complete successfully, leaving residual data that could be causing the error. Here are some steps you can take to debug and potentially resolve the issue:
- Check for Orphaned Records: Investigate the database for any orphaned records or inconsistencies related to the deleted repository. This might involve checking the _tool_github_repos table for any entries that should have been removed but weren't.
- Database Logs: Review the database logs for any errors or warnings that occurred during the deletion process. This might provide insights into what went wrong.
- Application Logs: Enable additional logging around the deletion and addition processes to capture more detailed information about the operations being performed and any errors encountered.
- Reproduce the Issue: If possible, try to reproduce the issue in a controlled environment to better understand the conditions that lead to the error.
- Database Health: Given that your database was under high load, consider checking its health and performance metrics. You might need to optimize or upgrade your database instance to handle the operations more smoothly.
Improving the error message to indicate whether the issue is with the GitHub API or the database would indeed be helpful. If you're willing to submit a PR, you could consider enhancing the error handling and logging to provide more informative messages.
I had a look at _tool_github_repos and everything looks fine; the list of repos is correct. I'm not sure this is a database issue. I had a look at a few other tables and the data looks consistent.
@iMichka The error you ran into happened while parsing the request data. Can you open the browser's network tab, check the payload being sent to the server, and see whether the data for the problematic repo differs from the data for the ones that worked fine?
Right, I forgot to check that part, makes sense.
Here is the payload when the error appears; it looks like null is passed:
When it works well the payload is fine:
One odd thing I noticed
- I have added 11 repositories with success
- One of them has its checkbox unchecked, even though it has been added successfully (this 11th repo is on the second page of the repo list on the connection page)
- I can re-add that 11th repo with success, but the checkbox never appears checked. Maybe it's related to the pagination?
- When I want to add a 12th repo, that one fails with the error and the null payload
I am sure this has to do with pagination. When I go to the second page of the repo list and then click on "add data scope", all the repos from page 1 have their checkbox unchecked. So the search list does not take pagination into account.
The following call seems wrong:
search-remote-scopes?search=mobility&page=1&pageSize=20
This should not use pagination IMHO (or the search list should have pagination implemented)
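To make the suspected pagination problem concrete, here is a small Go sketch of the hypothesis only. It is not DevLake's UI code: the `Scope` type and the `checkedIDs` function are hypothetical. It shows that if the "already added" state is computed from only the currently loaded page of added repos, search results that live on another page come back unchecked.

```go
package main

import "fmt"

// Scope is a hypothetical, minimal stand-in for a repository data scope.
type Scope struct {
	ID   string
	Name string
}

// checkedIDs returns the IDs of search results that appear in the given
// list of already-added scopes.
func checkedIDs(searchResults, added []Scope) map[string]bool {
	addedSet := make(map[string]bool, len(added))
	for _, s := range added {
		addedSet[s.ID] = true
	}
	checked := make(map[string]bool)
	for _, s := range searchResults {
		if addedSet[s.ID] {
			checked[s.ID] = true
		}
	}
	return checked
}

func main() {
	allAdded := []Scope{{"1", "repo-a"}, {"2", "repo-b"}, {"3", "repo-c"}}
	pageTwo := allAdded[2:] // only the second page of added repos is loaded

	results := []Scope{{"1", "repo-a"}, {"3", "repo-c"}}

	// Comparing against one page misses repo-a, which is on page 1.
	fmt.Println(len(checkedIDs(results, pageTwo))) // 1

	// Comparing against the full list checks both.
	fmt.Println(len(checkedIDs(results, allAdded))) // 2
}
```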
I updated to v1.0.2-beta7 just to make sure this was not fixed, and the issue is also present on that version.
I see, thanks for the input, will look into it soon. cc @d4x1
@iMichka Have you tried creating a new project and configuring that specific repo? It looks like something is wrong with the existing configuration.
@d4x1 here is what I tried:
- I stopped the devlake containers
- I dropped the devlake database and started with a clean, empty database
- Restarted devlake containers which did a fresh database setup
Even then, I am still getting the "Invalid data row (400)" error. This is happening only on one specific repository. Other repositories are fine. All of them are part of the same GitHub Org, and I use the same GitHub App for all of them.
From my understanding, we have two different, unrelated issues. The checkbox UI/UX issue (due to pagination) is unrelated to the "Invalid data row (400)" error. The "Invalid data row (400)" error is only related to the data being null in the payload.
FYI I just deployed version 1.0.2-beta8 and it looks like the issue (invalid data row) is gone. I think #8394 fixed it, as it changed the way the org is handled in the form.
The wrong checkboxes being selected is still there (based on the pagination), but the invalid data row issue is gone :)
Problem resolved