cal.com
feat: add opt-in ready-to-deploy message queue (QStash+Next.js functions)
What does this PR do?
This PR introduces a robust message queuing system that is tightly integrated in the codebase and very extensible. It is a fully opt-in feature for production-like cal.com deployments that want to enhance the scalability, reliability, and performance of their application. It has already been integrated for all emails and webhooks.
Benefits:
- Extensibility: Fully opt-in, designed to cater to diverse deployment scenarios.
- Efficiency: Simplifies email dispatch processes with unified parameter treatment, less code bloat.
- Minimal code and infrastructure overhead (less than 500 lines with emails and webhook integrations) and uses mostly existing infrastructure
- Enhanced Security: Robust token-based message sending and signature checks for task runners ensure high security.
Short Technical Video and Demo
More info
This PR also showcases the integration for:
- [DONE] All emails sent by the application, by substituting the sendEmail calls with a dispatcher (we unified the parameters into a single object to simplify handling)
- [DONE] All Webhook handling
- [FUTURE] Integration with 3rd Party Calendar Event APIs could also be done
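The dispatcher idea described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `EmailTask`, `QueueConfig`, `planDispatch`, and `dispatchEmail` are hypothetical names, and the message body is a generic shape rather than the QStash wire format. It shows how a single serialisable parameter object lets one call site either send directly (queue not configured) or enqueue a message addressed to a task runner route.

```typescript
// Hypothetical names throughout; none of these identifiers come from the PR.

type EmailTask = {
  template: string; // e.g. "booking-confirmed"
  payload: Record<string, unknown>; // single, JSON-serialisable parameter object
};

type QueueConfig = {
  publishUrl: string; // queue endpoint that accepts messages (assumed)
  token: string; // token required to enqueue a message
  runnerUrl: string; // application route that will execute the task
};

type DispatchPlan =
  | { mode: "direct"; task: EmailTask }
  | { mode: "queue"; url: string; headers: Record<string, string>; body: string };

// Pure decision step: without a queue config the original direct path is kept,
// which is what makes the feature opt-in.
function planDispatch(task: EmailTask, queue?: QueueConfig): DispatchPlan {
  if (!queue) return { mode: "direct", task };
  return {
    mode: "queue",
    url: queue.publishUrl,
    headers: {
      Authorization: `Bearer ${queue.token}`,
      "Content-Type": "application/json",
    },
    // Generic envelope, assumed for illustration only.
    body: JSON.stringify({ destination: queue.runnerUrl, task }),
  };
}

// Executes the plan. Both side effects are injected so the sketch stays
// independent of any mail or HTTP library.
async function dispatchEmail(
  task: EmailTask,
  sendDirect: (task: EmailTask) => Promise<void>,
  post: (url: string, headers: Record<string, string>, body: string) => Promise<void>,
  queue?: QueueConfig
): Promise<DispatchPlan["mode"]> {
  const plan = planDispatch(task, queue);
  if (plan.mode === "direct") {
    await sendDirect(plan.task);
  } else {
    await post(plan.url, plan.headers, plan.body);
  }
  return plan.mode;
}
```

Keeping the decision in a pure function makes the opt-in behaviour easy to unit test without a queue or a mail server.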
Although this was done as part of OSShack to address https://osshack.com/bounty/25 (aka https://github.com/calcom/cal.com/issues/12556 or CAL-2764), the architecture and technology choices were deliberately made to minimise the changes required and to avoid bloat. These changes will also improve the reliability and performance of the cal.com app in general.
We add only a single lightweight external dependency (Upstash QStash) as the message queue, from a provider (Upstash) that is already used for another feature in the code base (rate limiting). Furthermore, any similar queue system that can call URLs could be integrated quite easily as an alternative.
As we use application functions as task runners, the infrastructure overhead is minimal and everything is integrated in the same code base. Because Next.js functions are serverless in production settings, the system scales well. Similarly, existing tools for logging and performance monitoring (e.g. Sentry, Vercel logging and others) will work readily. In addition, there is an extra logging service for the message queue that is easily accessible from the QStash dashboard.
For testing, the task runners for the different types of tasks are simply routes, so they can be tested like any other Next.js pages route (app directory routes could also be used in the future). From a security standpoint, sending messages to the queue requires a token, while the task runner checks the message signature, which verifies both the origin and that the message has not been tampered with, so the setup is quite solid.
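The signature check on the task runner side can be illustrated with a simplified stand-in. This sketch uses a plain HMAC-SHA256 over the raw body via Node's `crypto` module; the actual QStash signature scheme differs and would normally be verified with the QStash SDK, so `signBody`, `verifySignature`, and `handleTask` are hypothetical helpers and not the code in this PR.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Simplified stand-in for the queue's signature: hex HMAC-SHA256 of the body.
function signBody(body: string, signingKey: string): string {
  return createHmac("sha256", signingKey).update(body).digest("hex");
}

// Recompute the signature and compare in constant time.
function verifySignature(body: string, signature: string, signingKey: string): boolean {
  const expected = Buffer.from(signBody(body, signingKey), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}

// Hypothetical task runner core: reject anything that is unsigned or altered
// before doing any work.
function handleTask(
  body: string,
  signature: string | undefined,
  signingKey: string
): { status: number; message: string } {
  if (!signature || !verifySignature(body, signature, signingKey)) {
    return { status: 401, message: "invalid signature" };
  }
  return { status: 200, message: "task accepted" };
}
```

Because the handler core is an ordinary function, it can be exercised in a unit test with a known key, independently of the route that wraps it.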
Requirement/Documentation
- See requirements in CAL-2764 and https://github.com/calcom/cal.com/issues/12556
Type of change
- [x] Chore (refactoring code, technical debt, workflow improvements)
- [x] New feature (non-breaking change which adds functionality)
How should this be tested?
For end-to-end local testing, an ngrok or similar tunnel is required. However, if QSTASH_URL=localhost, the task runners can be tested locally like any other Next.js function, and also by using the app.
- Are there environment variables that should be set?
To test the QStash integration, QSTASH_URL and QSTASH_TOKEN must be set.
- What are the minimal test data to have?
- What is expected (happy path) to have (input and output)?
- Any other important info that could help to test that PR
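As a rough illustration of how the two environment variables mentioned above could gate the feature, the sketch below resolves a queue mode from them. The three modes and the `resolveQueueMode` helper are assumptions made for illustration: in particular, treating a `localhost` URL as a "local" mode that needs no token is a guess at the behaviour described above, not something confirmed by the PR code.

```typescript
// Hypothetical helper; only the variable names QSTASH_URL and QSTASH_TOKEN
// come from the PR description.
type QueueEnv = { QSTASH_URL?: string; QSTASH_TOKEN?: string };
type QueueMode = "disabled" | "local" | "remote";

function resolveQueueMode(env: QueueEnv): QueueMode {
  const url = env.QSTASH_URL?.trim();
  const token = env.QSTASH_TOKEN?.trim();

  // No URL: the feature stays off and the existing direct code path is used.
  if (!url) return "disabled";

  // Assumption: a localhost URL means task runners are called locally,
  // so no tunnel and no token are needed.
  if (url.includes("localhost")) return "local";

  // A remote queue is only usable when a token is present.
  return token ? "remote" : "disabled";
}
```

In an application this would typically be called once with `process.env` at startup, so that a misconfigured deployment falls back to the direct path rather than failing at send time.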
Mandatory Tasks
- [x] Make sure you have self-reviewed the code. A decent size PR without self-review might be rejected.
Checklist
- We have read the contributing guide
- My code does follow the style guidelines of this project
- We have commented my code, particularly in hard-to-understand areas
- We have checked if my PR needs changes to the documentation to a certain degree
- We have checked if my changes generate no new warnings to a certain degree
- We haven't added abundant tests yet
- We haven't checked if new and existing unit tests pass locally with my changes
@pablodecm is attempting to deploy a commit to the cal Team on Vercel.
A member of the Team first needs to authorize it.
Thank you for following the naming conventions! 🙏 Feel free to join our discord and post your PR link.
New dependencies detected. Learn more about Socket for GitHub ↗︎
| Packages | Version | New capabilities | Transitives | Size | Publisher |
|---|---|---|---|---|---|
| @upstash/qstash | 2.3.0 | network, environment | +0 | 83.4 kB | hezarfen |
📦 Next.js Bundle Analysis for @calcom/web
This analysis was generated by the Next.js Bundle Analysis action. 🤖
This PR introduced no changes to the JavaScript bundle! 🙌
Post-OSShack TODOs (i.e. things to check before activating message queue in an actual production environment):
- Refactor TFunction (an internationalization helper) in the relevant data structures, where it causes problems with some emails, into something serialisable
- More thorough testing in a preview/staging environment closer to production (possibly including some load testing)
With those two items done, plus some additional code review, this could lead to having a message queue in production environments really fast. Some unit testing may also be worthwhile, although we are using the same functions that were used without the message queue, so functionally the behaviour is the same. Other improvements could include adding better/more specific TS typing to the dispatchers, but that is less crucial.
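The TFunction item above comes down to the fact that functions do not survive JSON serialisation, so a message containing one reaches the task runner without it. A minimal sketch of one possible fix follows: send only the locale and rebuild the translation function on the runner. The `Attendee` shape, the tiny `dictionaries` table, and the helper names are hypothetical and only loosely modelled on how a translation function might be attached to email parameters.

```typescript
type TFunction = (key: string) => string;

// Tiny stand-in for a real translation backend (assumed for illustration).
const dictionaries: Record<string, Record<string, string>> = {
  en: { greeting: "Hello" },
  es: { greeting: "Hola" },
};

function getTranslation(locale: string): TFunction {
  const dict = dictionaries[locale] ?? dictionaries["en"];
  // Fall back to the key itself when no translation exists.
  return (key) => dict?.[key] ?? key;
}

// Hypothetical parameter shapes: one with a function, one safe to enqueue.
type Attendee = { name: string; language: { translate: TFunction; locale: string } };
type SerialisableAttendee = { name: string; language: { locale: string } };

// Before enqueueing: keep the locale, drop the function.
function toSerialisable(attendee: Attendee): SerialisableAttendee {
  return { name: attendee.name, language: { locale: attendee.language.locale } };
}

// On the task runner: rebuild the function from the locale.
function rehydrate(attendee: SerialisableAttendee): Attendee {
  const { locale } = attendee.language;
  return { name: attendee.name, language: { locale, translate: getTranslation(locale) } };
}
```

With this split, the existing email functions can keep receiving a `translate` function, while the queue only ever carries plain data.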
This PR is being marked as stale due to inactivity.
@pablodecm thank you for the patience, we will look into this PR now after the holidays are over
Hi @PeerRich @keithwillcode @emrysal,
I am fully back at the (home) office this week and I will have some time in the coming weeks. I am not sure how much of a priority the queue system is relative to other product initiatives, but I would be very happy to see this merged and used in production, so I am happy to help out if I can.
What is the current plan of action and timeline for reviewing this PR and addressing its known limitations (https://github.com/calcom/cal.com/pull/12658#issuecomment-1837620543, i.e. mainly refactoring so that emails that include TFunction in their parameters are serialisable, plus additional testing)? How do you prefer to proceed?
In addition, some merge conflict resolution is needed due to recent changes in the email manager, but that should be quite easy.
cc @lmiguelvargasf
Hey there, there is a merge conflict, can you take a look?
Moving back to draft until we take the message queue project head on in 3.9
This PR is being marked as stale due to inactivity.