
Austin token encryption

Open JoshuaSBrown opened this issue 8 months ago • 5 comments

PR Description

Tasks

  • [x] - A description of the PR has been provided, and a diagram included if it is a new feature.
  • [ ] - Formatter has been run
  • [ ] - CHANGELOG comment has been added
  • [x] - Labels have been assigned to the PR
  • [x] - A reviewer has been added
  • [x] - A user has been assigned to work on the PR
  • [x] - If this is a new feature, a unit test has been added

Summary by Sourcery

Implement AES-256-CBC token encryption support and integrate it across the system

New Features:

  • Add CipherEngine component for token encryption/decryption using OpenSSL
  • Encrypt access and refresh tokens in DatabaseAPI and decrypt them in TaskWorker
  • Extend Foxx user_router token/set and token/get endpoints to include IV and length parameters
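As an illustration of the extended endpoints, a client could carry the encrypted-token metadata as query parameters. The base URL and field layout below are hypothetical; only the parameter names (`access_iv`, `access_len`, `refresh_iv`, `refresh_len`) come from the Foxx changes described in this PR:

```javascript
// Build a token/set request URL carrying encrypted-token metadata.
// The base URL is illustrative; the actual Foxx route lives in
// core/database/foxx/api/user_router.js.
function buildTokenSetUrl(base, fields) {
  const params = new URLSearchParams({
    access: fields.access,                  // base64 ciphertext of the access token
    access_iv: fields.accessIv,             // base64 IV used for the access token
    access_len: String(fields.accessLen),   // plaintext length in bytes
    refresh: fields.refresh,
    refresh_iv: fields.refreshIv,
    refresh_len: String(fields.refreshLen),
  });
  return `${base}/token/set?${params.toString()}`;
}

const url = buildTokenSetUrl("http://localhost:8529/_db/sdms/api/usr", {
  access: "q1w2e3", accessIv: "aXYxMjM0NTY3ODkwMTI=", accessLen: 43,
  refresh: "e4r5t6", refreshIv: "aXYwOTg3NjU0MzIxMDk=", refreshLen: 43,
});
console.log(url);
```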

Enhancements:

  • Add readFile utility to load encryption key from file
  • Update shell scripts to install and configure OpenSSL, and adjust dependency installation for libssl/libcrypto
  • Modify Dockerfile for debug build configuration and include valgrind

Build:

  • Include OpenSSL in CMake build, link common library against libssl and libcrypto
  • Update copy_dependency.sh and dependency_install_functions.sh to handle OpenSSL setup

Tests:

  • Add unit tests for CipherEngine and TaskWorker token encryption/decryption
  • Update Foxx microservice tests to validate IV and length query parameters for tokens

Chores:

  • Apply formatting and whitespace cleanup across various shell scripts and CMake files

JoshuaSBrown · May 15 '25 20:05

Reviewer's Guide

This PR integrates AES-256-CBC token encryption across the stack by introducing a CipherEngine utility, updating core C++ code to decrypt tokens on retrieval and encrypt on storage, modifying TaskWorker to prepare tokens for tasks, and extending build/test scripts and Foxx services to support IV and length metadata.

File-Level Changes

Change: Add CipherEngine for AES-CBC encryption and Base64 encoding
Details:
  • Implement CipherEngine.cpp/h wrapping OpenSSL AES-256-CBC
  • Add encode64/decode64 helpers and error handling
  • Implement readFile() to load key from disk
  • Add tests for encryption/decryption (test_CipherEngine)
Files:
  • common/source/CipherEngine.cpp
  • common/include/common/CipherEngine.hpp
  • common/source/Util.cpp
  • common/include/common/Util.hpp
  • common/tests/unit/test_CipherEngine.cpp

Change: Encrypt tokens in DatabaseAPI get/set
Details:
  • Load 32-byte key file in userGetAccessToken
  • Decrypt incoming access/refresh tokens before returning
  • Encrypt tokens in userSetAccessToken and send iv/len as query params
  • Update CMakeLists to link OpenSSL
Files:
  • core/server/DatabaseAPI.cpp
  • CMakeLists.txt
  • common/CMakeLists.txt

Change: Prepare and decrypt tokens in TaskWorker
Details:
  • Add tokenNeedsUpdate() and prepToken() helpers
  • Decrypt tokens before launching raw data transfers
  • Use CipherEngine to manage IV/len metadata
Files:
  • core/server/TaskWorker.cpp
  • core/server/TaskWorker.hpp
  • core/server/tests/unit/test_TaskWorker.cpp

Change: Propagate IV and length fields through Foxx and tests
Details:
  • Extend Foxx user_router to accept access_len/access_iv/refresh_len/refresh_iv
  • Update tasks.js and models to include iv/len fields
  • Update unit tests and fixtures to include new params
  • Modify build/install scripts to install OpenSSL
Files:
  • core/database/foxx/api/user_router.js
  • core/database/foxx/api/tasks.js
  • core/database/foxx/api/models/*
  • core/database/foxx/tests/*
  • scripts/dependency_install_functions.sh
  • scripts/install_core_dependencies.sh
  • scripts/install_dependencies.sh

Possibly linked issues

  • #123: PR implements symmetric encryption for tokens by adding encryption/decryption logic and updating token handling.


sourcery-ai[bot] · May 15 '25 20:05

OK, now we are also seeing this:

2025-07-03T19:58:30.295217Z ERROR /datafed/source/core/server/Config.cpp:loadRepositoryConfig:47 { "thread_name": "core_server-repoCacheThread", "thread_id": "2", "message": "Ignoring repo/datafed-at-hot-potato - invalid public key: " }

It looks like the ZeroMQ keys are behaving oddly.

JoshuaSBrown · Jul 03 '25 20:07

It looks like that failed because curl was missing from the Python client machine that was running the tests.

Warning: Found 2 installation flag files with prefix '.nvm_installed-'. Cleaning up...
Removed all existing installation flag files with prefix '.nvm_installed-'.
/home/gitlab-runner/builds/gYxDkX87B/0/dlsw/datafed/datafed/scripts/dependency_install_functions.sh: line 430: /shared/install/bin/curl: No such file or directory

JoshuaSBrown · Jul 03 '25 20:07

@nedvedba, can you take a look at this and give some input?

58.72 Setting up libboost-mpi-python-dev (1.74.0.3) ...
58.72 Setting up libboost-mpi-dev (1.74.0.3) ...
58.73 Setting up libboost-all-dev (1.74.0.3) ...
58.74 Processing triggers for libc-bin (2.36-9+deb12u10) ...
58.90 DEPENDENCIES (setuptools setuptools sphinx sphinx-rtd-theme sphinx-autoapi)
58.90 /datafed/source/scripts/dependency_install_functions.sh: line 159: python3.9: command not found
------
Dockerfile.dependencies:51
--------------------
  49 |     # Core dependencies
  50 |     COPY ./scripts/install_core_dependencies.sh ${BUILD_DIR}/scripts/
  51 | >>> RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC ${BUILD_DIR}/scripts/install_dependencies.sh -a -r -z -w
  52 |     
  53 |     # Web dependencies
--------------------
ERROR: failed to solve: process "/bin/bash -c DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC ${BUILD_DIR}/scripts/install_dependencies.sh -a -r -z -w" did not complete successfully: exit code: 127

I'm not sure why the Python interpreter is not being found here.

https://code.ornl.gov/dlsw/datafed/datafed/-/jobs/3353972

JoshuaSBrown · Jul 03 '25 21:07

To test this with existing state, the following was done:

  1. The devel branch was checked out
  2. cd compose/all
  3. The ./build_all_images.sh script was run
  4. docker compose up
  5. A user logged in, and a repository, allocation, record, and transfer were created; output from arangosh was grabbed to show user creds and tokens.
  6. docker compose down was then run
  7. The foxx volume was removed; this is necessary to ensure it is reinstalled when we switch to Austin's branch
  8. Switched to Austin's branch
  9. Rebuilt the images
  10. docker compose up
  11. Logged in and verified there were no errors in the log outputs and that the iv and other required parameters work. A transfer on a new record was run successfully.

JoshuaSBrown · Jul 16 '25 16:07