[Bug][Github] Data too long for column 'name'

Open gabriel-suela opened this issue 5 months ago • 12 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

What happened

I am getting an error when collecting data from GitHub:

Error running pipeline 24. Wraps: (2) Error running task 167. | Wraps: (2) subtask generateDeployments ended unexpectedly | Wraps: (3) error adding result to batch (500) | Wraps: (4) Error 1406 (22001): Data too long for column 'name' at row 367 (500) | Wraps: (5) Error 1406 (22001): Data too long for column 'name' at row 367 | Error types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *hintdetail.withDetail (4) *hintdetail.withDetail (5) *mysql.MySQLError Error types: (1) *hintdetail.withDetail (2) *errors.errorString

What do you expect to happen

Data collection should not fail. Either trim the string to fit the column, if possible, or increase the length of the column.

How to reproduce

I'm collecting data from 3 repos, and this is happening in just one specific project.

Anything else

No response

Version

v1.0.2

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

gabriel-suela avatar Oct 02 '25 15:10 gabriel-suela

The error happens because the name column in the cicd_deployments table is defined as varchar(255), but some GitHub deployment names exceed this length, causing MySQL to reject the insert and fail the pipeline. There is currently no trimming or validation for this field before insertion in the code, so any name longer than 255 characters will trigger this error.

You have two main options:

  1. Increase the column length in your database schema and model (for example, to varchar(500)), which allows longer names but may impact performance if the column is indexed.
  2. Trim the name field in code before insertion, ensuring it never exceeds 255 characters, which avoids schema changes but may result in loss of information for very long names.

Both approaches are valid and have been used for similar issues in DevLake before (example). If you want a quick workaround, trimming the string in the extraction or insertion logic is safest for stability. If you prefer to preserve full names, update the schema and run the necessary migrations.

Let me know if you want step-by-step guidance for either approach!

To reply, just mention @dosu.



dosubot[bot] avatar Oct 02 '25 15:10 dosubot[bot]

I need a permanent fix for this

gabriel-suela avatar Oct 05 '25 17:10 gabriel-suela

For a permanent fix, you have two main options:

  1. Increase the column length or change its type: Since the name column in cicd_deployments is not indexed or a key, you can safely expand it from varchar(255) to something larger (like varchar(500) or TEXT). This approach preserves the full deployment name and is the recommended solution when you want to avoid data loss. The DevLake community has used migration scripts for similar fixes in other plugins, such as changing column types to TEXT or increasing varchar length to handle long strings without errors (example, example). You would need to update the struct definition and add a migration script to apply the schema change.

  2. Trim the deployment name in code before insertion: This avoids schema changes and is safer if you want to guarantee stability, but it will cut off any names longer than 255 characters. This method is often used when the full value isn't critical for analysis (example, example).

Most users and maintainers prefer schema changes via migration scripts for permanent fixes, especially when the column is not indexed or a key. If you want, I can provide a sample migration script or code snippet for either approach—just let me know which you prefer.
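For the schema-change option, the statement that a migration would ultimately need to issue on MySQL can be sketched as follows. This is not DevLake's migration API, whose helper signatures are not shown in this thread; `buildModifyColumn` is a hypothetical helper that only builds the SQL text, and the allowed type list is an assumption based on the sizes mentioned above.

```go
package main

import (
	"fmt"
	"regexp"
)

// identPattern restricts table and column names to plain identifiers so that
// they can be safely placed between backticks.
var identPattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// buildModifyColumn returns a MySQL ALTER TABLE statement that changes the
// type of one column. Only the types discussed in this thread are accepted.
func buildModifyColumn(table, column, newType string) (string, error) {
	if !identPattern.MatchString(table) || !identPattern.MatchString(column) {
		return "", fmt.Errorf("invalid identifier: %q.%q", table, column)
	}
	switch newType {
	case "TEXT", "VARCHAR(500)":
		// accepted
	default:
		return "", fmt.Errorf("unsupported column type %q", newType)
	}
	return fmt.Sprintf("ALTER TABLE `%s` MODIFY `%s` %s", table, column, newType), nil
}

func main() {
	stmt, err := buildModifyColumn("cicd_deployments", "name", "TEXT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(stmt)
}
```

Changing the column by hand with such a statement is not enough on its own if the model definition in the code still declares the old type, so the struct and the migration need to be updated together.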

dosubot[bot] avatar Oct 05 '25 17:10 dosubot[bot]

After changing the name column in the cicd_deployments table to varchar(500), I'm getting this error:

Error running pipeline 42. Wraps: (2) Error running task 287. | Wraps: (2) subtask Collect Issues ended unexpectedly | Wraps: (3) | | combined messages: | | { | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | } | Error types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString Error types: (1) *hintdetail.withDetail (2) *errors.errorString

gabriel-suela avatar Oct 06 '25 13:10 gabriel-suela

The schema change to cicd_deployments.name is not related to the "graphql query got error" in the Collect Issues subtask—these parts of DevLake are isolated in both code and data flow. The error is likely due to a known issue in the GitHub GraphQL plugin, where zero time values for nullable fields caused failures in data extraction and insertion. This was recently fixed in PR #8583, which normalizes zero-date fields to nil and resolves similar GraphQL errors.

If you're running v1.0.2, you may need to update to a newer release that includes this fix. If the error persists after updating, check for API rate limits, authentication problems, or malformed queries, and enable debug logging for more details. If you can share more detailed logs, I can help interpret them.

dosubot[bot] avatar Oct 06 '25 13:10 dosubot[bot]

Do I need to do something else? I've changed the name column in cicd_deployments to text and I'm getting the same error.

gabriel-suela avatar Oct 06 '25 15:10 gabriel-suela

You still get the error because the Go struct for CICDDeployment is defined with Name as type:varchar(255), so the application enforces the 255-character limit even if the database column is TEXT. To fix this permanently, update the Name field in backend/core/models/domainlayer/devops/cicd_deployment.go to gorm:"type:text", then create and run a migration script to change the column type in the database (using migrationhelper.ChangeColumnsType or similar). After that, rebuild and redeploy DevLake so the model and schema are in sync. If you have other tables with a name column (like cicd_tasks or cicd_deployment_commits), check and update those as well if needed.
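The tag change described above might look like the following sketch. The `deployment` struct is a simplified, hypothetical stand-in with a single field, not the real CICDDeployment definition, and `gormTag` is an illustrative helper that reads the tag through reflection so the example runs without importing GORM.

```go
package main

import (
	"fmt"
	"reflect"
)

// deployment is a hypothetical, reduced stand-in for the domain-layer model.
// The gorm "type" tag declares the column data type for the Name field.
type deployment struct {
	Name string `gorm:"type:text"` // previously type:varchar(255)
}

// gormTag returns the gorm struct tag of the named field, if present.
func gormTag(model interface{}, field string) (string, bool) {
	t := reflect.TypeOf(model)
	if t == nil {
		return "", false
	}
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return "", false
	}
	f, ok := t.FieldByName(field)
	if !ok {
		return "", false
	}
	return f.Tag.Lookup("gorm")
}

func main() {
	tag, ok := gormTag(deployment{}, "Name")
	fmt.Println(tag, ok)
}
```

Because the tag is compiled into the binary, a change like this only takes effect after DevLake is rebuilt and redeployed, which is why altering the database column alone did not help.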

dosubot[bot] avatar Oct 06 '25 15:10 dosubot[bot]

I'm using Helm to install DevLake on my cluster.

gabriel-suela avatar Oct 06 '25 15:10 gabriel-suela

Can I talk to a human, please?

gabriel-suela avatar Oct 06 '25 16:10 gabriel-suela

Hello @gabriel-suela, if you need to extend the DB field, then as Dosu mentioned you will need to alter the relevant struct field and generate the corresponding migration. If you want to deploy this through the Helm chart, you will need to build your own image, tag it accordingly, and use that image in the Helm chart.

petkostas avatar Oct 07 '25 15:10 petkostas

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] avatar Dec 07 '25 00:12 github-actions[bot]

Hi @petkostas,

I've modified the code to keep only the first 255 characters and ignore the rest, which is acceptable for my current needs. I've tested this locally and confirmed it works. Now I need guidance on how to build the Helm chart from my modified code for testing.

veetmoradiya3628 avatar Dec 09 '25 11:12 veetmoradiya3628

Let's release a new version this week! I heard the Helm chart is having build issues, so I'll take a look on Thursday if I find the time.

klesh avatar Dec 15 '25 07:12 klesh