GitLab Data Collection Error for PR Metadata

Open sgoggins opened this issue 6 months ago • 1 comments

Description:

Key Issue: There appears to be a logic error in identifying the repo_id for a git_url for GitLab repositories. It is likely buried in the logic of augur/tasks/gitlab/merge_request_task.py and its calls into augur/tasks/util/collection_util.py.

Core error:

psycopg2.errors.UndefinedFunction: operator does not exist: character varying = integer[]
LINE 3: WHERE augur_data.repo.repo_git = ARRAY[897,896,895,894,893,8...
                                       ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
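
One possible reading of this message, not confirmed in this thread: the value bound to repo_git in the failing lookup is a list of integers rather than a URL string, so PostgreSQL is asked to compare a character varying column with an integer[]. A minimal sketch of a type guard follows. The helper name repo_git_or_none is hypothetical and is not part of Augur.

```python
def repo_git_or_none(value):
    """Return value if it is a non-empty string, otherwise None.

    Hypothetical helper (not part of Augur): lets a caller skip the
    Repo lookup when the argument is not a git URL string.
    """
    if isinstance(value, str) and value:
        return value
    return None


# A list of integer ids, like the one in the error parameters, is rejected.
print(repo_git_or_none([897, 896, 895]))
# A URL string passes through unchanged.
print(repo_git_or_none("https://gitlab.com/example/project"))
```

A caller could check for None before running the query and log the unexpected argument type instead of issuing the comparison.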


Whole Error Message:

Traceback (most recent call last):
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/ai.chaoss/augur/tasks/gitlab/merge_request_task.py", line 306, in collect_merge_request_metadata
    process_mr_metadata(metadata_list, f"{owner}/{repo}: Mr metadata task", repo_id, logger, session)
  File "/home/sean/github/ai.chaoss/augur/tasks/gitlab/merge_request_task.py", line 339, in process_mr_metadata
    all_metadata.extend(extract_needed_mr_metadata(metadata, repo_id, pull_request_id, tool_source, tool_version, data_source))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/ai.chaoss/augur/application/db/data_parse.py", line 1091, in extract_needed_mr_metadata
    head = {'sha': mr_dict['diff_refs']['head_sha'],
                   ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedFunction: operator does not exist: character varying = integer[]
LINE 3: WHERE augur_data.repo.repo_git = ARRAY[897,896,895,894,893,8...
                                       ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 468, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 379, in on_error
    R = I.handle_error_state(
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 178, in handle_error_state
    return {
           ^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 231, in handle_failure
    task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
  File "/home/sean/github/ai.chaoss/augur/tasks/init/celery_app.py", line 106, in on_failure
    self.augur_handle_task_failure(exc, task_id, repo_git, "core_task_failure")
  File "/home/sean/github/ai.chaoss/augur/tasks/init/celery_app.py", line 89, in augur_handle_task_failure
    repo = session.query(Repo).filter(Repo.repo_git == repo_git).one()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2798, in one
    return self._iter().one()  # type: ignore
           ^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2847, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
                                                  ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2188, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
           ^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1639, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1848, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1988, in _exec_single_context
    self._handle_dbapi_exception(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2343, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: character varying = integer[]
LINE 3: WHERE augur_data.repo.repo_git = ARRAY[897,896,895,894,893,8...
                                       ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

[SQL: SELECT augur_data.repo.repo_id AS augur_data_repo_repo_id, augur_data.repo.repo_group_id AS augur_data_repo_repo_group_id, augur_data.repo.repo_git AS augur_data_repo_repo_git, augur_data.repo.repo_path AS augur_data_repo_repo_path, augur_data.repo.repo_name AS augur_data_repo_repo_name, augur_data.repo.repo_added AS augur_data_repo_repo_added, augur_data.repo.repo_type AS augur_data_repo_repo_type, augur_data.repo.url AS augur_data_repo_url, augur_data.repo.owner_id AS augur_data_repo_owner_id, augur_data.repo.description AS augur_data_repo_description, augur_data.repo.primary_language AS augur_data_repo_primary_language, augur_data.repo.created_at AS augur_data_repo_created_at, augur_data.repo.forked_from AS augur_data_repo_forked_from, augur_data.repo.updated_at AS augur_data_repo_updated_at, augur_data.repo.repo_archived_date_collected AS augur_data_repo_repo_archived_date_collected, augur_data.repo.repo_archived AS augur_data_repo_repo_archived, augur_data.repo.tool_source AS augur_data_repo_tool_source, augur_data.repo.tool_version AS augur_data_repo_tool_version, augur_data.repo.data_source AS augur_data_repo_data_source, augur_data.repo.data_collection_date AS augur_data_repo_data_collection_date, augur_data.repo.repo_src_id AS augur_data_repo_repo_src_id 
FROM augur_data.repo 
WHERE augur_data.repo.repo_git = %(repo_git_1)s]
[parameters: {'repo_git_1': [897, 896, 895, 894, 893, 892, 891, 890, 889, 888, 887, 886, 885, 884, 883, 882, 881, 880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 870, 869, 868, ... (4077 characters truncated) ... 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]}]
(Background on this error at: https://sqlalche.me/e/20/f405)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedFunction: operator does not exist: character varying = integer[]
LINE 3: WHERE augur_data.repo.repo_git = ARRAY[897,896,895,894,893,8...
                                       ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/billiard/pool.py", line 362, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
                                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 649, in fast_trace_task
    R, I, T, Rstr = tasks[task].__trace__(
                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 572, in trace_task
    I, _, _, _ = on_error(task_request, exc, uuid)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 379, in on_error
    R = I.handle_error_state(
        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 178, in handle_error_state
    return {
           ^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/celery/app/trace.py", line 231, in handle_failure
    task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
  File "/home/sean/github/ai.chaoss/augur/tasks/init/celery_app.py", line 106, in on_failure
    self.augur_handle_task_failure(exc, task_id, repo_git, "core_task_failure")
  File "/home/sean/github/ai.chaoss/augur/tasks/init/celery_app.py", line 89, in augur_handle_task_failure
    repo = session.query(Repo).filter(Repo.repo_git == repo_git).one()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2798, in one
    return self._iter().one()  # type: ignore
           ^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/query.py", line 2847, in _iter
    result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
                                                  ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2188, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
           ^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1639, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1848, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1988, in _exec_single_context
    self._handle_dbapi_exception(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2343, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/home/sean/github/virtualenv/ai/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: character varying = integer[]
LINE 3: WHERE augur_data.repo.repo_git = ARRAY[897,896,895,894,893,8...
                                       ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

[SQL: SELECT augur_data.repo.repo_id AS augur_data_repo_repo_id, augur_data.repo.repo_group_id AS augur_data_repo_repo_group_id, augur_data.repo.repo_git AS augur_data_repo_repo_git, augur_data.repo.repo_path AS augur_data_repo_repo_path, augur_data.repo.repo_name AS augur_data_repo_repo_name, augur_data.repo.repo_added AS augur_data_repo_repo_added, augur_data.repo.repo_type AS augur_data_repo_repo_type, augur_data.repo.url AS augur_data_repo_url, augur_data.repo.owner_id AS augur_data_repo_owner_id, augur_data.repo.description AS augur_data_repo_description, augur_data.repo.primary_language AS augur_data_repo_primary_language, augur_data.repo.created_at AS augur_data_repo_created_at, augur_data.repo.forked_from AS augur_data_repo_forked_from, augur_data.repo.updated_at AS augur_data_repo_updated_at, augur_data.repo.repo_archived_date_collected AS augur_data_repo_repo_archived_date_collected, augur_data.repo.repo_archived AS augur_data_repo_repo_archived, augur_data.repo.tool_source AS augur_data_repo_tool_source, augur_data.repo.tool_version AS augur_data_repo_tool_version, augur_data.repo.data_source AS augur_data_repo_data_source, augur_data.repo.data_collection_date AS augur_data_repo_data_collection_date, augur_data.repo.repo_src_id AS augur_data_repo_repo_src_id 
FROM augur_data.repo 
WHERE augur_data.repo.repo_git = %(repo_git_1)s]
[parameters: {'repo_git_1': [897, 896, 895, 894, 893, 892, 891, 890, 889, 888, 887, 886, 885, 884, 883, 882, 881, 880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 870, 869, 868, ... (4077 characters truncated) ... 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]}]
(Background on this error at: https://sqlalche.me/e/20/f405)
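
The first exception in the traceback is a TypeError raised while reading mr_dict['diff_refs']['head_sha'], which indicates that diff_refs was None for at least one merge request. A minimal sketch of a defensive read follows. The helper name head_sha_or_none is hypothetical, and whether a missing SHA should be stored as None is an assumption rather than Augur's documented behavior.

```python
def head_sha_or_none(mr_dict):
    """Return mr_dict['diff_refs']['head_sha'], or None if either level is missing.

    Hypothetical helper (not part of Augur): tolerates a merge request
    whose 'diff_refs' key is absent or set to None.
    """
    diff_refs = mr_dict.get("diff_refs") or {}
    return diff_refs.get("head_sha")


# diff_refs is None, as in the traceback: no TypeError is raised.
print(head_sha_or_none({"diff_refs": None}))
# diff_refs is present: the SHA is returned.
print(head_sha_or_none({"diff_refs": {"head_sha": "abc123"}}))
```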

sgoggins · Jun 13 '25 19:06

Can you take a look at this? https://github.com/chaoss/augur/pull/3190

Akshatb2006 · Jun 15 '25 09:06

This should have been closed as completed due to https://github.com/chaoss/augur/pull/3217

MoralCode · Oct 23 '25 17:10