[Refactor][helper] Declarative ApiCollector
What and why to refactor
Currently, we have 2 types of ApiCollectors:
- ApiCollector - Stateless API collection helper
- ApiCollectorStateManager - Stateful API collection helper on top of ApiCollector which tracks the LastSuceededTime
They work in 4 modes:
- FullSync from ApiCollector: ApiCollector offers no help with incremental collection, so the developer must implement that feature on their own.
- IncrementalSync by updatedAt from ApiCollectorStateManager: helps when the API supports filtering records by the updatedAt field.
- IncrementalSync by createdAt from ApiCollectorStateManager: helps when the API supports filtering records by createdAt, or when returned records are sorted by createdAt. Useful for collecting Pipelines/Jobs that can be finalized (cannot be re-opened) and automated short-lived entities (no human operation involved, and they are closed within days).
- IncrementalSync by createdAt plus refresh unfinished records from ApiCollectorStateManager: builds on the previous mode and additionally refreshes unfinished records. Useful for collecting PRs when the API doesn't support filtering by updatedAt.
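To make the "sorted by createdAt" case concrete, here is a minimal sketch of how a collector could stop paging early. It assumes the API returns records newest first; the record type, the fetchPage callback and the exampleRun helper are hypothetical and are not part of the existing helper package.

```go
package main

import (
	"fmt"
	"time"
)

// record is a hypothetical stand-in for one item returned by the API.
type record struct {
	ID        int
	CreatedAt time.Time
}

// collectNewRecords walks pages whose records are assumed to be sorted by
// CreatedAt, newest first, and stops at the first record that is not newer
// than createdAfter. fetchPage is a hypothetical callback that returns an
// empty slice once no pages remain.
func collectNewRecords(fetchPage func(page int) []record, createdAfter time.Time) []record {
	var collected []record
	for page := 1; ; page++ {
		items := fetchPage(page)
		if len(items) == 0 {
			return collected
		}
		for _, item := range items {
			if !item.CreatedAt.After(createdAfter) {
				// Everything from here on was already collected by a previous run.
				return collected
			}
			collected = append(collected, item)
		}
	}
}

// exampleRun feeds three in-memory pages to collectNewRecords, with the
// previous run assumed to have reached the record created at base+3h.
func exampleRun() []record {
	base := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	pages := [][]record{
		{{ID: 5, CreatedAt: base.Add(5 * time.Hour)}, {ID: 4, CreatedAt: base.Add(4 * time.Hour)}},
		{{ID: 3, CreatedAt: base.Add(3 * time.Hour)}, {ID: 2, CreatedAt: base.Add(2 * time.Hour)}},
		{{ID: 1, CreatedAt: base.Add(1 * time.Hour)}},
	}
	fetch := func(page int) []record {
		if page > len(pages) {
			return nil
		}
		return pages[page-1]
	}
	return collectNewRecords(fetch, base.Add(3*time.Hour))
}

func main() {
	for _, r := range exampleRun() {
		fmt.Println(r.ID)
	}
}
```

In this run only records 5 and 4 are collected, and the third page is never requested. If the API sorts oldest first, or not at all, this early exit does not apply and filtering by query string would be needed instead.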
The problem: Developers must figure out how they work, the details/differences of each mode, and which one to use with what parameters.
Take the Jenkins builds collector as an example:
func CollectApiBuilds(taskCtx plugin.SubTaskContext) errors.Error {
	data := taskCtx.GetData().(*JenkinsTaskData)
	db := taskCtx.GetDal()
	collector, err := helper.NewStatefulApiCollectorForFinalizableEntity(helper.FinalizableApiCollectorArgs{
		RawDataSubTaskArgs: helper.RawDataSubTaskArgs{
			...
		},
		ApiClient: data.ApiClient,
		CollectNewRecordsByList: helper.FinalizableApiCollectorListArgs{
			PageSize:    100,
			Concurrency: 10,
			FinalizableApiCollectorCommonArgs: helper.FinalizableApiCollectorCommonArgs{
				UrlTemplate: fmt.Sprintf("%sjob/%s/api/json", data.Options.JobPath, data.Options.JobName),
				Query: func(reqData *helper.RequestData, createdAfter *time.Time) (url.Values, errors.Error) {
					...
				},
				ResponseParser: func(res *http.Response) ([]json.RawMessage, errors.Error) {
					...
				},
			},
			GetCreated: func(item json.RawMessage) (time.Time, errors.Error) {
				...
			},
		},
		CollectUnfinishedDetails: &helper.FinalizableApiCollectorDetailArgs{
			BuildInputIterator: func() (helper.Iterator, errors.Error) {
				...
			},
			FinalizableApiCollectorCommonArgs: helper.FinalizableApiCollectorCommonArgs{
				UrlTemplate: fmt.Sprintf("%sjob/%s/{{ .Input.Number }}/api/json?tree=number,url,result,timestamp,id,duration,estimatedDuration,building",
					data.Options.JobPath, data.Options.JobName),
				ResponseParser: func(res *http.Response) ([]json.RawMessage, errors.Error) {
					...
				},
			},
		},
	})
	if err != nil {
		return err
	}
	return collector.Execute()
}
It is hard to just copy the code and make a new collector correctly.
- One would need to understand all collectors and all modes
- One would need to understand what the API endpoint can offer
Describe the solution you'd like
The problem could be solved by offering a document with detailed descriptions/tutorials of how to use them against different APIs, but that would be a huge effort for both the author and the readers.
I believe a better solution is to refactor the ApiCollector and make it Declarative:
collector := &DeclarativeApiCollector{
	RawDataSubTaskArgs: ...,
	ApiClient: ...,
	TimeAfterFiltering: {
		ByUpdateAt: {
			Supported: true/false,
			Via: QUERY_STRING,
			KeyName: "updated_at_after",
		},
		ByCreatedAt: {
			RecordFinalizable: true, // panic if false was given
			Strategy: COLLECT_FINALIZED_RECORDS_ONLY | REFRESH_UNFINALIZED,
			Via: QUERY_STRING | SORTED_RECORDS, // SORTED_RECORDS: returned records are sorted by `createdAt`
			KeyName: "",
			GetCreated: func(record json.RawMessage) {...}, // extract createdAt from the json
		},
	},
}
return collector.Execute()
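The declaration above is pseudocode. As a rough sketch of how it might be typed in Go and mapped back to the four existing modes, consider the following. Every type, constant and function here is hypothetical and only mirrors the field names in the proposal. Two choices are assumptions of this sketch rather than part of the proposal: it returns an error where the proposal says to panic, and it maps COLLECT_FINALIZED_RECORDS_ONLY to the plain "IncrementalSync by createdAt" mode.

```go
package main

import (
	"errors"
	"fmt"
)

// Via describes how the time filter reaches the API (hypothetical type).
type Via string

const (
	ViaQueryString   Via = "QUERY_STRING"
	ViaSortedRecords Via = "SORTED_RECORDS"
)

// Strategy describes what to do with records that are not finalized yet (hypothetical type).
type Strategy string

const (
	CollectFinalizedRecordsOnly Strategy = "COLLECT_FINALIZED_RECORDS_ONLY"
	RefreshUnfinalized          Strategy = "REFRESH_UNFINALIZED"
)

// ByUpdateAt declares whether the API can filter by updatedAt.
type ByUpdateAt struct {
	Supported bool
	Via       Via
	KeyName   string
}

// ByCreatedAt declares how the API exposes createdAt.
type ByCreatedAt struct {
	RecordFinalizable bool
	Strategy          Strategy
	Via               Via
	KeyName           string
}

// TimeAfterFiltering groups both declarations; a nil pointer means "not declared".
type TimeAfterFiltering struct {
	ByUpdateAt  *ByUpdateAt
	ByCreatedAt *ByCreatedAt
}

// resolveMode picks one of the four modes from the declaration, so the
// developer no longer has to choose a collector type by hand.
func resolveMode(f TimeAfterFiltering) (string, error) {
	if f.ByUpdateAt != nil && f.ByUpdateAt.Supported {
		return "IncrementalSync by updatedAt", nil
	}
	if f.ByCreatedAt == nil {
		return "FullSync", nil
	}
	if !f.ByCreatedAt.RecordFinalizable {
		// The proposal panics here; this sketch returns an error instead.
		return "", errors.New("ByCreatedAt requires finalizable records")
	}
	if f.ByCreatedAt.Strategy == RefreshUnfinalized {
		return "IncrementalSync by createdAt plus refresh unfinished records", nil
	}
	return "IncrementalSync by createdAt", nil
}

func main() {
	mode, err := resolveMode(TimeAfterFiltering{
		ByCreatedAt: &ByCreatedAt{
			RecordFinalizable: true,
			Strategy:          RefreshUnfinalized,
			Via:               ViaSortedRecords,
		},
	})
	fmt.Println(mode, err)
}
```

The point of the sketch is that the developer only states what the API endpoint supports, and the choice of mode becomes a single lookup inside the helper.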
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter the similar problem in the future.