[Bug][Github] Data too long for column 'name'
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
I am getting an error when collecting data from GitHub:
Error running pipeline 24. Wraps: (2) Error running task 167. | Wraps: (2) subtask generateDeployments ended unexpectedly | Wraps: (3) error adding result to batch (500) | Wraps: (4) Error 1406 (22001): Data too long for column 'name' at row 367 (500) | Wraps: (5) Error 1406 (22001): Data too long for column 'name' at row 367 | Error types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *hintdetail.withDetail (4) *hintdetail.withDetail (5) *mysql.MySQLError Error types: (1) *hintdetail.withDetail (2) *errors.errorString
What do you expect to happen
Data collection should not fail. Either trim the string to fit the column, or increase the length of the column.
How to reproduce
I'm collecting data from 3 repos, and this is only happening in one specific project.
Anything else
No response
Version
v1.0.2
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
The error happens because the name column in the cicd_deployments table is defined as varchar(255), but some GitHub deployment names exceed this length, causing MySQL to reject the insert and fail the pipeline. There is currently no trimming or validation for this field before insertion in the code, so any name longer than 255 characters will trigger this error.
You have two main options:
- Increase the column length in your database schema and model (for example, to varchar(500)), which allows longer names but may impact performance if the column is indexed.
- Trim the name field in code before insertion, ensuring it never exceeds 255 characters, which avoids schema changes but may result in loss of information for very long names.
Both approaches are valid and have been used for similar issues in DevLake before (example). If you want a quick workaround, trimming the string in the extraction or insertion logic is safest for stability. If you prefer to preserve full names, update the schema and run the necessary migrations.
Let me know if you want step-by-step guidance for either approach!
To reply, just mention @dosu.
I need a permanent fix for this
For a permanent fix, you have two main options:
- Increase the column length or change its type: Since the name column in cicd_deployments is not indexed or a key, you can safely expand it from varchar(255) to something larger (like varchar(500) or TEXT). This approach preserves the full deployment name and is the recommended solution when you want to avoid data loss. The DevLake community has used migration scripts for similar fixes in other plugins, such as changing column types to TEXT or increasing varchar length to handle long strings without errors (example, example). You would need to update the struct definition and add a migration script to apply the schema change.
- Trim the deployment name in code before insertion: This avoids schema changes and is safer if you want to guarantee stability, but it will cut off any names longer than 255 characters. This method is often used when the full value isn't critical for analysis (example, example).
Most users and maintainers prefer schema changes via migration scripts for permanent fixes, especially when the column is not indexed or a key. If you want, I can provide a sample migration script or code snippet for either approach—just let me know which you prefer.
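For orientation only, the schema change itself boils down to a single MySQL statement. The sketch below runs it through Go's standard database/sql interface. It is not a DevLake migration script: DevLake has its own migration framework, whose API is not shown in this thread. The execer interface, widenNameColumn, and the recorder type are hypothetical names introduced for illustration.

```go
package main

import (
	"database/sql"
	"fmt"
)

// execer is a hypothetical minimal interface; *sql.DB and *sql.Tx satisfy it.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// widenNameStmt is MySQL syntax. MODIFY COLUMN redefines the whole column,
// so any existing NOT NULL or DEFAULT attributes would need to be restated.
const widenNameStmt = "ALTER TABLE cicd_deployments MODIFY COLUMN name TEXT"

// widenNameColumn changes cicd_deployments.name to TEXT.
func widenNameColumn(db execer) error {
	if _, err := db.Exec(widenNameStmt); err != nil {
		return fmt.Errorf("widening cicd_deployments.name: %w", err)
	}
	return nil
}

// recorder stands in for a live database so the sketch runs without one.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string, args ...interface{}) (sql.Result, error) {
	r.queries = append(r.queries, query)
	return nil, nil
}

func main() {
	r := &recorder{}
	if err := widenNameColumn(r); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.queries[0])
}
```

Against a real database you would pass a *sql.DB opened with a MySQL driver instead of the recorder. The model definition has to change as well, otherwise the schema and the code drift apart.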
After changing the name column in the cicd_deployments table to varchar(500), I'm getting this error:
Error running pipeline 42. Wraps: (2) Error running task 287. | Wraps: (2) subtask Collect Issues ended unexpectedly | Wraps: (3) | | combined messages: | | { | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | ===================== | | graphql query got error | | } | Error types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString Error types: (1) *hintdetail.withDetail (2) *errors.errorString
The schema change to cicd_deployments.name is not related to the "graphql query got error" in the Collect Issues subtask—these parts of DevLake are isolated in both code and data flow. The error is likely due to a known issue in the GitHub GraphQL plugin, where zero time values for nullable fields caused failures in data extraction and insertion. This was recently fixed in PR #8583, which normalizes zero-date fields to nil and resolves similar GraphQL errors.
If you're running v1.0.2, you may need to update to a newer release that includes this fix. If the error persists after updating, check for API rate limits, authentication problems, or malformed queries, and enable debug logging for more details. If you can share more detailed logs, I can help interpret them.
Do I need to do something else? I've changed the name column in cicd_deployments to text and I'm getting the same error.
You still get the error because the Go struct for CICDDeployment is defined with Name as type:varchar(255), so the application enforces the 255-character limit even if the database column is TEXT. To fix this permanently, update the Name field in backend/core/models/domainlayer/devops/cicd_deployment.go to gorm:"type:text", then create and run a migration script to change the column type in the database (using migrationhelper.ChangeColumnsType or similar). After that, rebuild and redeploy DevLake so the model and schema are in sync. If you have other tables with a name column (like cicd_tasks or cicd_deployment_commits), check and update those as well if needed.
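The struct-tag change described above could look roughly like this. The CICDDeployment type here is a trimmed-down stand-in with a single field, since the full struct is not shown in this thread. The gormTag helper is hypothetical and only exists to make the sketch runnable without importing GORM.

```go
package main

import (
	"fmt"
	"reflect"
)

// CICDDeployment is an illustrative stand-in for the domain-layer model;
// the real struct has more fields than shown here.
type CICDDeployment struct {
	Name string `gorm:"type:text"` // previously type:varchar(255), per the thread
}

// gormTag returns the gorm struct tag of the named field, or "" if the
// field does not exist. model must be a struct value.
func gormTag(model interface{}, field string) string {
	f, ok := reflect.TypeOf(model).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("gorm")
}

func main() {
	fmt.Println(gormTag(CICDDeployment{}, "Name"))
}
```

Changing the tag alone does not alter an existing table. It only takes effect together with a migration and a rebuilt image.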
Reference
I'm using Helm to install DevLake on my cluster.
Can I talk to a human, please?
Hello @gabriel-suela, if you need to extend the DB field, as Dosu mentioned, you will need to alter the relevant struct field and generate the relevant migration. If you want to deploy this through the Helm chart, you will need to build your own image, tag it accordingly, and use that image in the Helm chart.
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.
hi @petkostas
I've modified the code to consider only the first 255 characters and ignore the rest, which is acceptable for my current needs. I've tested this locally and confirmed it is working. Now, I need guidance on how to build the Helm chart from my modified code changes for testing.
Let's release a new version this week! I heard the Helm chart is having build issues, so I’ll take a look on Thursday if I find time to do so.