Feature Proposal: use git clone to download the configs of connector
Summary
Currently, connector downloads configs from GitHub through raw.githubusercontent.com, which caches content for 5 minutes. As a result, right after a push to GitHub, connector gets the newest commit hash but stale content. One solution is to do a full git clone of the config repo instead of downloading the content over HTTP.
Design-level Explanation Actions
- [ ] How do we avoid introducing a new dependency to dataprep?
- [ ] How do we allow concurrent access to different branches of the config repo?
Design-level Explanation
Nothing should change in the user interface. In the worst case, users will need to have git installed on their machine.
Implementation-level Explanation
The new config workflow will be:
- check whether the config repo exists in the tmp folder
- if not, git clone the config repo
- if the user specifies `update`, do a git pull
- check out the specified branch, or the master branch if none is specified
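The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual dataprep implementation: the function names `plan_config_sync` and `sync_config`, the parameter names, and the idea of detecting an existing clone by its `.git` directory are all assumptions. It calls the `git` executable through `subprocess`, which avoids a new Python dependency but requires git on the user's machine.

```python
import subprocess
from pathlib import Path
from typing import List


def plan_config_sync(
    repo_url: str, repo_dir: Path, update: bool = False, branch: str = "master"
) -> List[List[str]]:
    """Return the git commands for the workflow, in order, without running them.

    All names here are hypothetical; they only mirror the steps listed above.
    """
    commands: List[List[str]] = []
    # Steps 1 and 2: clone only if the config repo is not already in the tmp folder.
    if not (repo_dir / ".git").exists():
        commands.append(["git", "clone", repo_url, str(repo_dir)])
    # Step 3: pull only when the user asks for an update.
    if update:
        commands.append(["git", "-C", str(repo_dir), "pull"])
    # Step 4: check out the requested branch (master by default).
    commands.append(["git", "-C", str(repo_dir), "checkout", branch])
    return commands


def sync_config(
    repo_url: str, repo_dir: Path, update: bool = False, branch: str = "master"
) -> None:
    """Run the planned commands; raises CalledProcessError if any git call fails."""
    for command in plan_config_sync(repo_url, repo_dir, update, branch):
        subprocess.run(command, check=True)
```

Note that `git pull` only updates the branch that is currently checked out, so an implementation following this order may want to pull again after switching branches.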
The biggest concern is that checkout switches the single working tree of the clone, so once we check out one branch, we prevent concurrent access to different branches of the config repo.
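One possible way around this, offered only as a sketch and not as a decided design, is to skip checkout entirely and read a file straight from a branch with `git show <ref>:<path>`, which does not touch the working tree. The helper names `show_command` and `read_config`, the `origin` remote name, and the example file path are assumptions for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def show_command(
    repo_dir: Path, branch: str, file_path: str, remote: str = "origin"
) -> List[str]:
    """Build a `git show` command that prints one file from one branch.

    `file_path` is relative to the repo root and uses forward slashes.
    """
    return ["git", "-C", str(repo_dir), "show", f"{remote}/{branch}:{file_path}"]


def read_config(repo_dir: Path, branch: str, file_path: str) -> str:
    """Return the file content from the given branch without any checkout."""
    result = subprocess.run(
        show_command(repo_dir, branch, file_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical usage: two readers can use different branches of the same clone.
# read_config(Path("/tmp/configs"), "master", "some_connector/_meta.json")
# read_config(Path("/tmp/configs"), "dev", "some_connector/_meta.json")
```

Because this reads remote-tracking refs such as `origin/master`, the update step would be a `git fetch` rather than a `git pull` under this approach.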
Rationale and Alternatives
Prior Art
Future Possibilities
Implementation-level Actions
Additional Tasks
- [x] This task is put into a correct pipeline (Development Backlog or In Progress).
- [x] The label of this task is set correctly.
- [x] The issue is assigned to the correct person.
- [x] The issue is linked to the related Epic.
- [ ] The documentation is changed accordingly.
- [ ] Tests are added accordingly.