[DAPS-1522] Data Router Logging Improvements

Open megatnt1122 opened this issue 1 month ago • 1 comment

Ticket

#1522

Description

Logging improvements to data_router

Tasks

  • [ ] - A description of the PR has been provided, and a diagram included if it is a new feature.
  • [ ] - Formatter has been run
  • [ ] - CHANGELOG comment has been added
  • [ ] - Labels have been assigned to the PR
  • [ ] - A reviewer has been added
  • [ ] - A user has been assigned to work on the PR
  • [ ] - If this is a new feature, a unit test has been added

Summary by Sourcery

Improve observability and permission handling in the data router API endpoints.

New Features:

  • Add structured request logging for all key data router endpoints, including correlation IDs and request status.

Enhancements:

  • Centralize permission checks by replacing direct usage of the permissions module with equivalent helpers on the shared support library.
  • Capture and log success and failure details (including results and errors where appropriate) for create, update, export, dependency graph, locking, path resolution, transfer, allocation/owner change, and delete operations to aid debugging and auditing.

megatnt1122 avatar Nov 25 '25 14:11 megatnt1122

Reviewer's Guide

Refactors data_router to use centralized permission helpers and introduces structured request lifecycle logging (start/success/failure) across all data routes for better observability and correlation.
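
The start/success/failure lifecycle can be pictured with a small stand-in logger. This is a minimal sketch, not the shared logger from the PR: `makeLogger`, the single-object argument, the `level` field, and the JSON line format are assumptions made for illustration. Only the three method names and the field names (`client`, `correlationId`, `httpVerb`, `routePath`, `status`, `description`, `extra`, `error`) come from the diagrams in this guide.

```javascript
// Hypothetical stand-in for the shared logger used by data_router.
// The three method names mirror the Logger class in this guide; everything
// else (object argument, "level" field, JSON output) is an assumption.
function makeLogger(sink) {
    function emit(level, fields) {
        const entry = Object.assign({ level: level }, fields);
        // Error objects do not survive JSON.stringify, so keep only the message text.
        if (entry.error !== undefined && entry.error !== null) {
            entry.error = String(entry.error.message || entry.error);
        }
        sink(JSON.stringify(entry));
        return entry;
    }
    return {
        logRequestStarted: (fields) => emit("info", fields),
        logRequestSuccess: (fields) => emit("info", fields),
        logRequestFailure: (fields) => emit("error", fields),
    };
}

// Usage: every line written for one request carries the same correlationId.
const demoLogger = makeLogger((line) => console.log(line));
demoLogger.logRequestStarted({
    client: "u/example",
    correlationId: "abc-123",
    httpVerb: "POST",
    routePath: "data/get",
    status: "Started",
    description: "Get (download) data",
});
```

Emitting one structured line per lifecycle event is what makes it possible to group the start, success, and failure entries of a single request by correlation ID.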

Sequence diagram for POST data/get with task initialization and logging

sequenceDiagram
    actor User
    participant ClientApp
    participant DataRouter
    participant GLib
    participant Logger
    participant ArangoDB
    participant GTasks

    User->>ClientApp: Request data download
    ClientApp->>DataRouter: POST /data/get?client=clientId

    DataRouter->>GLib: getUserFromClientID(clientId)
    GLib-->>DataRouter: client

    DataRouter->>Logger: logRequestStarted(client, correlationId, POST, data/get, Started)

    DataRouter->>ArangoDB: _executeTransaction(read metadata, validate ids)
    activate ArangoDB
    ArangoDB->>GLib: resolveDataID for each id
    GLib-->>ArangoDB: data ids
    ArangoDB-->>DataRouter: validated ids
    deactivate ArangoDB

    DataRouter->>GTasks: taskInitDataGet(client, path, encrypt, ids)
    GTasks-->>DataRouter: task descriptor result

    DataRouter-->>ClientApp: HTTP 200 result (task info)
    DataRouter->>Logger: logRequestSuccess(client, correlationId, POST, data/get, Success, result)

    alt Failure during transaction or task init
        DataRouter->>Logger: logRequestFailure(client, correlationId, POST, data/get, Failure, result, error)
        DataRouter->>GLib: handleException(error, res)
        GLib-->>ClientApp: Error response
    end
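
The sequence above can be sketched as a single handler. This is an illustration under stated assumptions rather than the code in data_router.js: the collaborators are passed in through a hypothetical `deps` object so the sketch runs outside Foxx, the logger is assumed to take one object argument, the `x-correlation-id` header name is a guess, and the collection name in the transaction config is illustrative.

```javascript
// Sketch of the POST data/get lifecycle shown in the sequence diagram.
// ASSUMPTIONS: "deps" injection, object-style logger calls, and the
// "x-correlation-id" header name are hypothetical, not taken from the PR.
function handleDataGet(req, res, deps) {
    const route = {
        correlationId: req.headers["x-correlation-id"],
        httpVerb: "POST",
        routePath: "data/get",
    };
    // Declared outside the try block so the failure branch can log
    // whatever had been reached before the exception.
    let client = null;
    let result = null;
    try {
        client = deps.g_lib.getUserFromClientID(req.queryParams.client);
        deps.logger.logRequestStarted(
            Object.assign({ client: client._id, status: "Started" }, route));

        // Resolve and validate the requested IDs inside one transaction.
        const ids = deps.g_db._executeTransaction({
            collections: { read: ["d"] }, // illustrative collection name
            action: () => req.body.id.map((id) => deps.g_lib.resolveDataID(id, client)),
        });

        result = deps.g_tasks.taskInitDataGet(client, req.body.path, req.body.encrypt, ids);
        res.send(result);
        deps.logger.logRequestSuccess(
            Object.assign({ client: client._id, status: "Success", extra: result }, route));
    } catch (e) {
        deps.logger.logRequestFailure(Object.assign({
            client: client ? client._id : undefined,
            status: "Failure",
            extra: result,
            error: e,
        }, route));
        deps.g_lib.handleException(e, res);
    }
}
```

Keeping `client` and `result` in the outer scope is the detail that lets the failure log include partial results, as described in the file-level changes.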

Class diagram for updated data_router modules and logging helpers

classDiagram
    class DataRouter {
        +post_create(req, res)
        +post_create_batch(req, res)
        +post_update(req, res)
        +post_update_batch(req, res)
        +post_update_md_err_msg(req, res)
        +post_update_size(req, res)
        +get_view(req, res)
        +post_export(req, res)
        +get_dep_graph_get(req, res)
        +get_lock(req, res)
        +get_path(req, res)
        +get_list_by_alloc(req, res)
        +post_get(req, res)
        +post_put(req, res)
        +post_alloc_chg(req, res)
        +post_owner_chg(req, res)
        +post_delete(req, res)
        -recordCreate(client, record, result)
        -recordUpdate(client, record, result)
    }

    class GLib {
        +getUserFromClientID(clientId)
        +getUserFromClientID_noexcept(clientId)
        +hasManagerPermProj(client, ownerId)
        +hasPermissions(client, resource, perms)
        +hasAdminPermObject(client, objectId)
        +getPermissions(client, resource, perms)
        +ensureAdminPermUser(client, userId)
        +ensureManagerPermProj(client, projId)
        +resolveDataID(id, client)
        +resolveDataCollID(id, client)
        +getObject(id, client)
        +hasPublicRead(dataId)
        +computeDataPath(loc, localOnly)
        +saveRecentGlobusPath(client, path, transferType)
        +handleException(error, res)
        <<constants>> PERM_CREATE
        <<constants>> PERM_WR_META
        <<constants>> PERM_WR_REC
        <<constants>> PERM_RD_REC
        <<constants>> PERM_RD_META
        <<constants>> PERM_RD_DATA
        <<constants>> TT_DATA_EXPORT
        <<constants>> TT_DATA_GET
        <<constants>> TT_DATA_PUT
    }

    class Logger {
        +logRequestStarted(client, correlationId, httpVerb, routePath, status, description)
        +logRequestSuccess(client, correlationId, httpVerb, routePath, status, description, extra)
        +logRequestFailure(client, correlationId, httpVerb, routePath, status, description, extra, error)
    }

    class ArangoDB {
        +_executeTransaction(config)
        +c_document(id)
        +d_document(id)
        +owner_firstExample(query)
        +loc_firstExample(query)
        +_update(id, patch)
        +_remove(id)
        +_query(query, bindVars)
    }

    class GTasks {
        +taskInitDataGet(client, path, encrypt, ids)
        +taskInitDataPut(client, path, encrypt, ids, check, srcRepoId)
        +taskInitRecAllocChg(client, projId, ids, opts)
        +taskInitRecOwnerChg(client, ids, collId, opts)
        +taskInitRecCollDelete(client, ids)
    }

    DataRouter --> GLib : uses
    DataRouter --> Logger : logs via
    DataRouter --> ArangoDB : executes transactions
    DataRouter --> GTasks : initializes tasks
    GLib --> ArangoDB : helper DB access
    GLib <.. DataRouter : centralized permissions and utilities
    Logger <.. DataRouter : request lifecycle logging

Flow diagram for generic data_router request lifecycle logging

flowchart TD
    A[Receive HTTP request<br/>client, headers, body] --> B[Resolve client<br/>getUserFromClientID or getUserFromClientID_noexcept]
    B --> C[logRequestStarted<br/>client, correlationId,<br/>httpVerb, routePath,<br/>status Started,<br/>description]
    C --> D{Handler uses<br/>ArangoDB and helpers}

    D --> E[Build result data<br/>or side effects only]
    E --> F[Send success response<br/>res.send result or empty]
    F --> G[logRequestSuccess<br/>client, correlationId,<br/>httpVerb, routePath,<br/>status Success,<br/>description, extra]

    D --> H[Exception thrown<br/>validation, permissions,<br/>DB error, task error]
    H --> I[logRequestFailure<br/>client, correlationId,<br/>httpVerb, routePath,<br/>status Failure,<br/>description, extra,<br/>error]
    I --> J[handleException<br/>via g_lib.handleException]
    J --> K[Error response<br/>status code and body]

    subgraph RetryLoop[Optional retry loop<br/>for write conflict 1200]
        D --> L{Arango errorNum 1200<br/>and retries left?}
        L -- yes --> D
        L -- no --> H
    end
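
The optional retry loop in the flowchart can be sketched as a small wrapper. This is an illustration only: `runWithConflictRetry` and its `maxRetries` parameter are hypothetical names, and the PR may structure its retries differently. The one fact relied on is that ArangoDB reports a write-write conflict as `errorNum` 1200.

```javascript
// ArangoDB error number for a write-write conflict (ERROR_ARANGO_CONFLICT).
const ARANGO_CONFLICT = 1200;

// Hypothetical helper: run "action", retrying only on write conflicts and
// only while retries remain. Any other error, or a conflict after the last
// retry, is rethrown so the caller's failure logging path handles it.
function runWithConflictRetry(action, maxRetries) {
    for (let attempt = 0; ; attempt++) {
        try {
            return action();
        } catch (e) {
            const retryable =
                e && e.errorNum === ARANGO_CONFLICT && attempt < maxRetries;
            if (!retryable) {
                throw e;
            }
        }
    }
}
```

With this shape, only the final outcome reaches the lifecycle logger: a success after one or more conflicts is logged once as a success, and an exhausted retry budget surfaces as a single failure.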

File-Level Changes

Change: Add structured request lifecycle logging to all major data_router endpoints using a shared logger with correlation IDs and route metadata.
Details:
  • Import shared logger module and define a base path for data routes
  • For each endpoint, capture the client once at the top-level and log request start with HTTP verb, route path, status, and human-readable description
  • On successful completion, log request success including result or key output data where appropriate
  • On exceptions, log request failures with error details and any partial result data before delegating to existing exception handling
  • Ensure variables like client/result are declared in outer scope so they are available in both success and failure logging blocks
Files: core/database/foxx/api/data_router.js

Change: Unify permission checks to use g_lib helpers instead of the separate permissions module throughout data_router.
Details:
  • Replace permissions.hasManagerPermProj and permissions.hasPermissions usages with equivalent g_lib helper functions and constants
  • Update record update, view, lock, path, and listing logic to use g_lib.hasAdminPermObject, g_lib.getPermissions, and g_lib permission flags for consistency
  • Remove redundant permissions import now that all permission checks are routed through g_lib
  • Keep overall permission semantics and error handling the same while centralizing implementation
Files: core/database/foxx/api/data_router.js
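
To make the permission change concrete, here is a minimal sketch of a call site that goes through a single helper object with bit-flag constants. Everything in it is hypothetical: `gLibSketch` is a stand-in and not the real g_lib, the flag values are illustrative, and storing grants on the record under `grants` is an assumption made so the example can run. Only the helper names `hasPermissions` and `getPermissions`, their `(client, resource, perms)` parameters, and the `PERM_*` constant names come from the class diagram.

```javascript
// Hypothetical stand-in for g_lib; flag values and the "grants" lookup are
// assumptions for illustration, not DataFed's actual implementation.
const gLibSketch = {
    PERM_RD_REC: 0x1,
    PERM_RD_META: 0x2,
    PERM_RD_DATA: 0x4,
    PERM_WR_META: 0x8,

    // Return the subset of the requested permission bits the client holds.
    getPermissions(client, resource, perms) {
        const granted = (resource.grants && resource.grants[client._id]) || 0;
        return granted & perms;
    },

    // True only when every requested permission bit is granted.
    hasPermissions(client, resource, perms) {
        return this.getPermissions(client, resource, perms) === perms;
    },
};

// Call site after the refactor: flags and the check come from one module.
function canViewMetadata(client, record) {
    const needed = gLibSketch.PERM_RD_REC | gLibSketch.PERM_RD_META;
    return gLibSketch.hasPermissions(client, record, needed);
}
```

Routing every check through one module means the flags and the check cannot drift apart between two imports, which is the consistency benefit the change list describes.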

sourcery-ai[bot] avatar Nov 25 '25 14:11 sourcery-ai[bot]