parabol
feat: embedder service
Description
Cleaned up for company, this is the embedder service, now being readied for merge.
While there are a few enhancements that I'd like to make in short order, this PR intends only to target our local development environments. When you review, keep your eye on that rather than asking, "is this ready for production?"
Enhancements to make after this PR merges:
- Add OpenAI embedding and GPT API support (rather than local dev services)
- Resume events that are stuck in a queued state
- Make the service fully event-driven (rather than using a polling loop)
👉 Consider starting your review by reading the README, then beginning your code review here
Demo
📺 https://www.loom.com/share/524994e87dc34c5db360d3045d576e62?sid=789e6b1e-9ef2-4d18-9f96-ee4476426753
Testing scenarios
Requirements
- [ ] Checkout and run https://github.com/ParabolInc/node-llama-cpp-text-generation-interface
- [ ] Update your .env from .env.example
  - [ ] Set: POSTGRES_USE_PGVECTOR=true
  - [ ] Set: AI_EMBEDDER_ENABLED
  - [ ] Set: AI_EMBEDDING_MODELS
  - [ ] Set: AI_GENERATION_MODELS
- [ ] Add the relatedDiscussions feature flag to any org you'd like retro discussions to be embedded (see here)
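As a quick sanity check on the requirements above, here is a minimal sketch of a helper that reports which of the listed variables are unset or empty. The function name `missingEmbedderVars` and the `Env` type are hypothetical and not part of this PR; only the variable names come from the checklist.

```typescript
// Hypothetical helper, not part of this PR: lists which of the variables
// named in the requirements checklist are unset or blank.
const REQUIRED_VARS = [
  'POSTGRES_USE_PGVECTOR',
  'AI_EMBEDDER_ENABLED',
  'AI_EMBEDDING_MODELS',
  'AI_GENERATION_MODELS'
] as const

type Env = Record<string, string | undefined>

function missingEmbedderVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name]
    // Treat undefined and whitespace-only values as missing.
    return value === undefined || value.trim() === ''
  })
}
```

In a Node script this could be called as `missingEmbedderVars(process.env)` after loading your .env; an empty array means every listed variable has some value. It does not validate the values themselves.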
Tests
- [ ] Can run embedder service via yarn dev
  - Observes rows being created in EmbeddingsMetadata
  - Observes rows entering/leaving EmbeddingsJobQueue
  - Observes rows being created into configured embedding vector table, e.g. Embeddings_ember_1
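To make the "rows entering/leaving the queue" observation concrete, here is a toy in-memory model of a single polling pass. It is a sketch under stated assumptions, not the PR's implementation: `ToyQueue`, `pollOnce`, the job states, and the `embed` callback are all hypothetical, and the real service works against Postgres tables rather than arrays.

```typescript
// Toy in-memory model. All names here are hypothetical and are not
// taken from the PR's code.
type Job = { id: number; text: string; state: 'queued' | 'embedding' }

class ToyQueue {
  private jobs: Job[] = []
  private nextId = 1

  // A row "enters" the queue.
  enqueue(text: string): Job {
    const job: Job = { id: this.nextId++, text, state: 'queued' }
    this.jobs.push(job)
    return job
  }

  // Claim the oldest queued job and mark it so it is not claimed twice.
  claim(): Job | undefined {
    const job = this.jobs.find((j) => j.state === 'queued')
    if (job) job.state = 'embedding'
    return job
  }

  // A row "leaves" the queue once its embedding is stored.
  complete(id: number): void {
    this.jobs = this.jobs.filter((j) => j.id !== id)
  }

  size(): number {
    return this.jobs.length
  }
}

// One polling pass: drain queued jobs, store one vector per job,
// then remove the job from the queue. Returns the number processed.
function pollOnce(
  queue: ToyQueue,
  embed: (text: string) => number[],
  store: Map<number, number[]>
): number {
  let processed = 0
  for (let job = queue.claim(); job !== undefined; job = queue.claim()) {
    store.set(job.id, embed(job.text))
    queue.complete(job.id)
    processed++
  }
  return processed
}
```

A polling service would call something like `pollOnce` on a timer; the event-driven follow-up listed under enhancements would instead trigger the work when a job is enqueued.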
Final checklist
- [ ] I checked the code review guidelines
- [ ] I have added Metrics Representative as reviewer(s) if my PR involves metrics/data/analytics related changes
- [ ] I have performed a self-review of my code, the same way I'd do it for any other team member
- [ ] I have tested all cases I listed in the testing scenarios and I haven't found any issues or regressions
- [ ] Whenever I took a non-obvious choice I added a comment explaining why I did it this way
- [ ] I added the label https://github.com/ParabolInc/parabol/labels/Skip%20Maintainer%20Review if the PR introduces only minor changes, does not contain any architectural changes or does not introduce any new patterns and I think one review is sufficient
- [ ] PR title is human readable and could be used in changelog
This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR will be delayed and might be rejected due to its size.
Ok, @mattkrick – I think I addressed most of your review comments. I hope I did an ok job 🤞
The majority I fixed right here on this branch. For a few items I created a milestone and would love to involve others in shaping this up before using it with a general audience.
I'd like to get this merged (as it won't affect any releases until the service is enabled and deployed) and then work as a team to get the following implemented:
- #9432
- #9439
- #9433
- #9436
- #9437
- #9438
Please comment on or edit the above issues with your input, and add any additional issues to the milestone, too :)
Could you take another pass on this PR?
just 1 little thing & a merge conflict. merge when ready!
Fixed up the yarn.lock conflict now. Just waiting on tests...