Feat: ai from s3

Open rauerhans opened this issue 1 year ago • 1 comment

Summary

Once the Dagster ETL for the Plural AI vector store index is validated in prod, this should be merged so that the data comes from the pipeline sink in S3. I left the old scraper as is for now; we can remove it once everything works as intended.

The storage context is now pulled from S3, so the main.py script needs to know where to find it and how to authenticate.

  • Auth: IRSA should work; otherwise you'll need to set the standard AWS env vars:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
  • Path: The script expects the S3 path in PLURAL_AI_INDEX_S3_PATH in the format <bucket-name>/<path>. It defaults to plural-assets/dagster/plural-ai/vector_store_index.

To be safe, AWS_DEFAULT_REGION should be set to the region of the bucket.
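
For reviewers, here is a rough sketch of how the configuration described above could be resolved. It is not the actual code in main.py: the helper names `resolve_index_location` and `has_static_credentials` and the `IndexLocation` type are hypothetical, and only the environment variable names, the `<bucket-name>/<path>` format, and the default value come from this PR.

```python
import os
from typing import Mapping, NamedTuple

# Default taken from the PR description.
DEFAULT_INDEX_S3_PATH = "plural-assets/dagster/plural-ai/vector_store_index"


class IndexLocation(NamedTuple):
    """Hypothetical container for the two halves of <bucket-name>/<path>."""
    bucket: str
    prefix: str


def resolve_index_location(env: Mapping[str, str] = os.environ) -> IndexLocation:
    """Split PLURAL_AI_INDEX_S3_PATH into a bucket name and a key prefix."""
    raw = env.get("PLURAL_AI_INDEX_S3_PATH", DEFAULT_INDEX_S3_PATH).strip("/")
    # Everything before the first slash is the bucket, the rest is the path.
    bucket, _, prefix = raw.partition("/")
    if not bucket or not prefix:
        raise ValueError(f"expected <bucket-name>/<path>, got {raw!r}")
    return IndexLocation(bucket, prefix)


def has_static_credentials(env: Mapping[str, str] = os.environ) -> bool:
    """True if both standard AWS key variables are set.

    When this is False, the sketch assumes IRSA supplies the credentials.
    """
    return bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
```

With no variables set, `resolve_index_location({})` falls back to the default and yields the bucket `plural-assets` with the prefix `dagster/plural-ai/vector_store_index`. A path without a slash is rejected early rather than being passed on to the S3 client.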

Labels

Test Plan

Checklist

  • [ ] If required, I have updated the Plural documentation accordingly.
  • [ ] I have added tests to cover my changes.
  • [ ] I have added a meaningful title and summary to convey the impact of this PR to a user.
  • [ ] I have added relevant labels to this PR to help with categorization for release notes.

rauerhans avatar Aug 25 '23 13:08 rauerhans

Easy and customizable dashboards for your build system. Learn more about Stoat ↗︎

Static Hosting

Name Link Commit Status
api-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
rtc-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
core-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
cron-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
email-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
worker-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
api-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
graphql-coverage Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
rtc-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
core-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
cron-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
email-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
worker-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅
graphql-test-results Visit e7bd4f24e7568840931b3524da84a1d4dd1c3840 ✅

Job Runtime

job runtime chart

stoat-app[bot] avatar Aug 25 '23 13:08 stoat-app[bot]