exp/lighthorizon: Create pubnet indices for the MVP endpoints.
Parent epic: #4317
Our target endpoint is:
GET /accounts/:id/transactions
So we need an index for each account that stores information about which ledgers the account was active in. Before we actually do this, there's an open question to answer: What kind of indices do we make?
- Should they be checkpoint-based or ledger-based? There are trade-offs either way.
- Or should we just create both? Then we can compare the differences in both index size and latency.
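To make the trade-off concrete, here is a rough sketch of the two granularities as plain in-memory bitmaps. It is not the lighthorizon index format: `bitmapIndex`, `checkpointOf`, and `buildIndices` are hypothetical names, and the mapping assumes 64-ledger checkpoints where checkpoint 1 covers ledgers 1 through 63.

```go
package main

// checkpointFrequency is the assumed number of ledgers per checkpoint.
const checkpointFrequency = 64

// checkpointOf maps a ledger sequence to a 1-based checkpoint number.
// Assumption: ledgers 1..63 fall in checkpoint 1, 64..127 in checkpoint 2, etc.
func checkpointOf(ledger uint32) uint32 {
	return ledger/checkpointFrequency + 1
}

// bitmapIndex is a hypothetical index with one bit per unit (ledger or checkpoint).
type bitmapIndex struct {
	bits []byte
}

// set marks unit n as active, growing the bitmap as needed.
func (b *bitmapIndex) set(n uint32) {
	byteIdx := n / 8
	for uint32(len(b.bits)) <= byteIdx {
		b.bits = append(b.bits, 0)
	}
	b.bits[byteIdx] |= 1 << (n % 8)
}

// has reports whether unit n is marked active.
func (b *bitmapIndex) has(n uint32) bool {
	byteIdx := n / 8
	if byteIdx >= uint32(len(b.bits)) {
		return false
	}
	return b.bits[byteIdx]&(1<<(n%8)) != 0
}

// buildIndices builds both variants from the ledgers an account was active in.
func buildIndices(activeLedgers []uint32) (ledgerIdx, checkpointIdx *bitmapIndex) {
	ledgerIdx, checkpointIdx = &bitmapIndex{}, &bitmapIndex{}
	for _, seq := range activeLedgers {
		ledgerIdx.set(seq)                   // exact, but one bit per ledger
		checkpointIdx.set(checkpointOf(seq)) // 64x smaller, but coarser
	}
	return ledgerIdx, checkpointIdx
}
```

In this sketch the checkpoint variant is roughly 64 times smaller, but a hit only says the account was active somewhere in that checkpoint, so a reader would still have to scan up to 64 ledgers to find the transactions. The ledger variant is exact at the cost of size. That is the size versus latency comparison building both would let us measure.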
The index builder is ready for parallel construction, but it might not be perfectly ready for AWS Batch. There is some preliminary work to get both steps (map and reduce) "dockerized" and running on Batch.
I'm taking over the last step here: running a ledger range in the cloud to confirm that the indices get built. Initially I'll build 07/01 - 07/31 to confirm.
@Shaptic, I'm looking into a k8s Job as a short-term alternative to AWS Batch for the batch processing. The cluster needs some adjustment to support JOB_COMPLETION_INDEX, per https://github.com/stellar/ops/issues/1790
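For context on why JOB_COMPLETION_INDEX matters: Kubernetes Indexed Jobs expose each pod's completion index through that environment variable, which lets every map worker pick its own slice of the ledger range. A minimal sketch of that partitioning follows; `jobIndexFromEnv`, `ledgerRangeForJob`, and the parameters are hypothetical and are not the actual index builder's flags or API.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// jobIndexFromEnv reads the completion index that an Indexed Job
// sets for each pod via JOB_COMPLETION_INDEX.
func jobIndexFromEnv() (uint32, error) {
	raw, ok := os.LookupEnv("JOB_COMPLETION_INDEX")
	if !ok {
		return 0, fmt.Errorf("JOB_COMPLETION_INDEX is not set")
	}
	n, err := strconv.ParseUint(raw, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid JOB_COMPLETION_INDEX %q: %w", raw, err)
	}
	return uint32(n), nil
}

// ledgerRangeForJob returns the inclusive ledger range for one worker,
// assuming (hypothetically) that every worker handles a fixed-size chunk.
func ledgerRangeForJob(firstLedger, ledgersPerJob, jobIndex uint32) (start, end uint32) {
	start = firstLedger + jobIndex*ledgersPerJob
	end = start + ledgersPerJob - 1
	return start, end
}
```

With these made-up numbers, a worker with index 3, a first ledger of 1000, and 6400 ledgers per job would process ledgers 20200 through 26599.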
Status update: the map/reduce jobs are running on k8s. I identified slow performance in the reduce jobs when they run against the S3 index. Next step: add logging to reduce that captures the time spent in sections with S3 calls, to identify I/O rates, then decide on reduce optimizations based on the findings.
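The kind of timing instrumentation meant here could look roughly like the sketch below. It is only an illustration: `sectionTimer` and the section names are hypothetical, and the real reduce code and its S3 calls are not shown.

```go
package main

import (
	"sync"
	"time"
)

// sectionTimer accumulates wall-clock time and call counts per named section.
type sectionTimer struct {
	mu     sync.Mutex
	totals map[string]time.Duration
	counts map[string]int
}

func newSectionTimer() *sectionTimer {
	return &sectionTimer{
		totals: make(map[string]time.Duration),
		counts: make(map[string]int),
	}
}

// track runs fn, records how long it took under the given section name,
// and passes through fn's error unchanged.
func (t *sectionTimer) track(section string, fn func() error) error {
	start := time.Now()
	err := fn()
	elapsed := time.Since(start)

	t.mu.Lock()
	defer t.mu.Unlock()
	t.totals[section] += elapsed
	t.counts[section]++
	return err
}

// average returns the mean duration per call for a section, or 0 if unseen.
func (t *sectionTimer) average(section string) time.Duration {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.counts[section] == 0 {
		return 0
	}
	return t.totals[section] / time.Duration(t.counts[section])
}
```

Wrapping each S3 read and write in `track("s3-read", ...)` or `track("s3-write", ...)` and logging the totals at the end of a reduce job would show how much of the runtime is I/O versus merging.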
After discussion with @Shaptic, the partial index data loaded onto S3 will suffice for the MVP criteria, so I'm moving this ticket to done. I carved out a new ticket, #4566, to follow up on reduce performance.