[Bug][GitHub] Invalid data row (400)

Open iMichka opened this issue 11 months ago • 11 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

What happened

I was doing some cleanup/setup in our connections and hit an issue.

I added a GitHub repository to a connection. Then I removed that repository and deleted the connection.

I created a new connection and wanted to add the repository in the "add data scope" dialog. This triggers an error, "invalid data row (400)" (it looks like it comes from these lines: https://github.com/apache/incubator-devlake/blob/main/backend/helpers/pluginhelper/api/ds_scope_api_helper.go#L116-L118).

See the screenshot (sorry I had to anonymize the repo names as this is internal company stuff I can't disclose):

Image

There are other repositories I was able to add / move around with success, it's just this specific one.

I am connected through a GitHub App, and this repo has the same permissions as the others.

Is there any way I can help debug this further? Some extra logs? Maybe the deletion did not work well and there is some cruft remaining in the database (what should I look for?)

What do you expect to happen

Adding the repo should work as for the others

Also a clearer error message would be welcome (is this a github issue? a database issue?)

How to reproduce

It's hard to tell.

Adding/removing a datasource (and thus deleting the data), then adding it back might be a way to reproduce it. I can try to work on a way to "always" reproduce.

Anything else

I noticed the database we use was quite saturated at the moment the data source was removed/deleted. I'm using an AWS RDS t3.micro instance, which is burstable, and it was really struggling while I removed multiple data sources. I probably hit a burst limit, so maybe my database content is slightly corrupted. But this is just a hypothesis; it may be something completely different that is broken.

Version

v1.0.2-beta5 and v1.0.2-beta7

Are you willing to submit PR?

  • [x] Yes I am willing to submit a PR!

iMichka avatar Apr 25 '25 15:04 iMichka

Hello @iMichka! I'm here to help you with bugs, answer your questions, and assist you in becoming a contributor. While you wait for a human maintainer, feel free to share any details, and I'll do my best to assist you!

The "invalid data row (400)" error you're encountering typically occurs when the data provided is not in the expected format. Specifically, in the PutMultiple function, each item in the data array is expected to be a map with string keys and interface values. If any item in the data array is not a map, this error is triggered [1].

Given your situation, it's possible that the deletion process did not complete successfully, leaving residual data that could be causing the error. Here are some steps you can take to debug and potentially resolve the issue:

  1. Check for Orphaned Records: Investigate the database for any orphaned records or inconsistencies related to the deleted repository. This might involve checking the _tool_github_repos table for any entries that should have been removed but weren't.

  2. Database Logs: Review the database logs for any errors or warnings that occurred during the deletion process. This might provide insights into what went wrong.

  3. Application Logs: Enable additional logging around the deletion and addition processes to capture more detailed information about the operations being performed and any errors encountered.

  4. Reproduce the Issue: If possible, try to reproduce the issue in a controlled environment to better understand the conditions that lead to the error.

  5. Database Health: Given that your database was under high load, consider checking its health and performance metrics. You might need to optimize or upgrade your database instance to handle the operations more smoothly.

Improving the error message to indicate whether the issue is with the GitHub API or the database would indeed be helpful. If you're willing to submit a PR, you could consider enhancing the error handling and logging to provide more informative messages.

dosubot[bot] avatar Apr 25 '25 15:04 dosubot[bot]

I had a look at _tool_github_repos and everything looks fine; the list of repos is correct. I'm not sure this is a database issue. I also looked at a few other tables and the data looks consistent.

iMichka avatar Apr 28 '25 09:04 iMichka

@iMichka The error you ran into happened during the request-data parsing phase. Can you open the browser's network tab, check the payload being sent to the server, and see whether the problematic repo's data differs from the others that worked fine?

klesh avatar Apr 29 '25 05:04 klesh

Right, I forgot to check that part, makes sense

Here is the payload when the error appears, looks like null is passed:

Image

When it works well the payload is fine:

Image
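
To illustrate why a null entry would be rejected by the kind of check described earlier in this thread (each item in the data array must be a map), here is a minimal Go sketch. It is not the DevLake implementation: the `requestBody` type, the `validateRows` function, and the `{"data": [...]}` payload shape are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// requestBody mirrors the ASSUMED shape of the request: {"data": [ ... ]}.
// The real payload may carry more fields.
type requestBody struct {
	Data []any `json:"data"`
}

// validateRows is a hypothetical helper. It returns an error if any item
// in "data" is not a JSON object (i.e. not a map with string keys).
func validateRows(raw []byte) error {
	var body requestBody
	if err := json.Unmarshal(raw, &body); err != nil {
		return fmt.Errorf("invalid request body: %w", err)
	}
	for i, row := range body.Data {
		// A JSON null decodes to a nil interface value, so this
		// type assertion fails for it just as it would for a string.
		if _, ok := row.(map[string]any); !ok {
			return fmt.Errorf("invalid data row at index %d", i)
		}
	}
	return nil
}

func main() {
	good := []byte(`{"data": [{"name": "org/repo"}]}`)
	bad := []byte(`{"data": [null]}`)
	fmt.Println(validateRows(good))
	fmt.Println(validateRows(bad))
}
```

Under these assumptions, the server-side check is behaving as designed, and the question becomes why the client sent a null entry for this one repository.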

One odd thing I noticed

  • I have added 11 repositories with success
  • One of them has its checkbox unchecked, even though it was added successfully (this 11th repo is on the second page of the repo list on the connection page)
  • I can re-add that 11th repo successfully, but the checkbox never appears checked. Maybe it's related to the pagination?
  • When I try to add a 12th repo, that one fails with the error and the empty payload
Image

iMichka avatar Apr 29 '25 07:04 iMichka

I am sure this has to do with pagination. When I go to the second page of the repo list, and then click on "add data scope", all the repos from page 1 have their checkbox unchecked. So the search list does not take pagination into account

The following call seems wrong:

search-remote-scopes?search=mobility&page=1&pageSize=20

This should not use pagination IMHO (or the search list should have pagination implemented)
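
A rough sketch of the idea, written in Go only for illustration (the actual UI code is not shown in this thread, and `scope`, `collectAddedIDs`, and `markChecked` are hypothetical names): the checked state of each search result is derived from the full set of already-added IDs rather than from whichever page of added scopes happens to be loaded.

```go
package main

import "fmt"

// scope is a hypothetical stand-in for a repository entry.
type scope struct {
	ID string
}

// collectAddedIDs gathers the IDs from every given page of added scopes.
func collectAddedIDs(pages [][]scope) map[string]bool {
	added := make(map[string]bool)
	for _, page := range pages {
		for _, s := range page {
			added[s.ID] = true
		}
	}
	return added
}

// markChecked reports, for each search result, whether it is already added.
func markChecked(results []scope, added map[string]bool) []bool {
	checked := make([]bool, len(results))
	for i, r := range results {
		checked[i] = added[r.ID]
	}
	return checked
}

func main() {
	// Two pages of already-added scopes; the user is viewing page 2.
	pages := [][]scope{
		{{ID: "repo-a"}, {ID: "repo-b"}},
		{{ID: "repo-c"}},
	}
	results := []scope{{ID: "repo-a"}, {ID: "repo-c"}, {ID: "repo-d"}}

	// Using only the visible page loses the selection of repo-a.
	fmt.Println(markChecked(results, collectAddedIDs(pages[1:])))
	// Using all pages keeps it.
	fmt.Println(markChecked(results, collectAddedIDs(pages)))
}
```

Whether the fix belongs in the search call or in how the dialog tracks selection is something the maintainers would need to decide; the sketch only shows why a page-scoped lookup produces unchecked boxes.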

iMichka avatar Apr 29 '25 08:04 iMichka

I updated to v1.0.2-beta7 just to make sure this was not fixed, and the issue is also present on that version.

iMichka avatar Apr 29 '25 15:04 iMichka

I see, thanks for the input, will look into it soon. cc @d4x1

klesh avatar May 12 '25 03:05 klesh

@iMichka Have you tried creating a new project and configuring that specific repo? It looks like something is wrong with the existing configuration.

d4x1 avatar May 12 '25 11:05 d4x1

@d4x1 here is what I tried:

  • I stopped the devlake containers
  • I dropped the devlake database and started with a clean, empty database
  • Restarted devlake containers which did a fresh database setup

Even then, I am still getting the "Invalid data row (400)" error. This is happening only on one specific repository. Other repositories are fine. All of them are part of the same GitHub org, and I use the same GitHub App for all of them.

iMichka avatar May 13 '25 08:05 iMichka

From my understanding we have two different, unrelated issues. The checkbox UI/UX issue (due to pagination) is unrelated to the "Invalid data row (400)" error. The "Invalid data row (400)" error is related only to the data being null in the request payload.

iMichka avatar May 13 '25 08:05 iMichka

FYI I just deployed version 1.0.2-beta8 and it looks like the issue (invalid data row) is gone. I think #8394 fixed it, as it changed the way the org is handled in the form.

The wrong checkboxes being selected (due to the pagination) is still there, but the invalid data row issue is gone :)

iMichka avatar May 14 '25 09:05 iMichka

Problem resolved

klesh avatar Jul 08 '25 06:07 klesh