exp/lighthorizon: Create pubnet indices for the MVP endpoints.

Open Shaptic opened this issue 1 year ago • 1 comments

Parent epic: #4317


Our target endpoint is:

GET /accounts/:id/transactions

So we need an index for each account that stores information about which ledgers the account was active in. Before we actually do this, there's an open question to answer: What kind of indices do we make?

  1. Should they be checkpoint-based or ledger-based? Each has trade-offs.
  2. Or should we just create both? Then we can compare index size and latency between the two.
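
To make the size/precision trade-off concrete, here is a minimal sketch of a per-account activity bitmap whose granularity is either one ledger or one checkpoint. The `Index` type and its methods are hypothetical and are not the lighthorizon index format. The mapping from ledger to checkpoint slot is simplified to integer division by 64 (Stellar history archives checkpoint every 64 ledgers); the real numbering may differ.

```go
package main

import "fmt"

// checkpointFrequency is the number of ledgers per history-archive checkpoint.
const checkpointFrequency = 64

// Index is a hypothetical per-account activity bitmap.
// granularity is 1 for ledger-based, checkpointFrequency for checkpoint-based.
type Index struct {
	granularity uint32
	bits        []byte
}

func NewIndex(granularity uint32) *Index {
	return &Index{granularity: granularity}
}

// slot maps a ledger sequence to a bit position (simplified assumption).
func (ix *Index) slot(ledger uint32) uint32 {
	return ledger / ix.granularity
}

// SetActive marks the slot containing the given ledger as active.
func (ix *Index) SetActive(ledger uint32) {
	s := ix.slot(ledger)
	byteIdx := s / 8
	for uint32(len(ix.bits)) <= byteIdx {
		ix.bits = append(ix.bits, 0)
	}
	ix.bits[byteIdx] |= 1 << (s % 8)
}

// IsActive reports whether the slot containing the given ledger is marked.
func (ix *Index) IsActive(ledger uint32) bool {
	s := ix.slot(ledger)
	byteIdx := s / 8
	if byteIdx >= uint32(len(ix.bits)) {
		return false
	}
	return ix.bits[byteIdx]&(1<<(s%8)) != 0
}

// SizeBytes returns the size of the uncompressed bitmap.
func (ix *Index) SizeBytes() int {
	return len(ix.bits)
}

func main() {
	byLedger := NewIndex(1)
	byCheckpoint := NewIndex(checkpointFrequency)
	for _, l := range []uint32{100, 1000} {
		byLedger.SetActive(l)
		byCheckpoint.SetActive(l)
	}
	fmt.Println("ledger-based size:", byLedger.SizeBytes())
	fmt.Println("checkpoint-based size:", byCheckpoint.SizeBytes())
	// Ledger 1001 was never active, but it shares a checkpoint with 1000.
	fmt.Println("ledger-based 1001:", byLedger.IsActive(1001))
	fmt.Println("checkpoint-based 1001:", byCheckpoint.IsActive(1001))
}
```

In this sketch the checkpoint-based bitmap is far smaller, but a hit only narrows the search to a 64-ledger window that still has to be scanned, which is where the latency difference would come from.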

The index builder is ready for parallel construction, but it might not be perfectly ready for AWS Batch. There is some preliminary work to get both steps (map and reduce) "dockerized" and running on Batch.
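
As a rough illustration of the two steps, the sketch below has a map step that builds a partial index for one chunk of ledgers and a reduce step that merges the partial indices per account. `Activity`, `PartialIndex`, `MapChunk`, and `Reduce` are hypothetical names and do not mirror the actual index builder's types or storage format.

```go
package main

import (
	"fmt"
	"sort"
)

// Activity records that an account participated in a ledger (hypothetical type).
type Activity struct {
	Ledger  uint32
	Account string
}

// PartialIndex maps an account to the ledgers it was active in.
type PartialIndex map[string][]uint32

// MapChunk builds a partial index from one chunk of ledger activity.
// Each chunk can be processed by an independent worker.
func MapChunk(events []Activity) PartialIndex {
	out := PartialIndex{}
	for _, e := range events {
		out[e.Account] = append(out[e.Account], e.Ledger)
	}
	return out
}

// Reduce merges partial indices into one, sorting and de-duplicating
// the ledger list of every account.
func Reduce(partials ...PartialIndex) PartialIndex {
	merged := PartialIndex{}
	for _, p := range partials {
		for acct, ledgers := range p {
			merged[acct] = append(merged[acct], ledgers...)
		}
	}
	for acct, ledgers := range merged {
		sort.Slice(ledgers, func(i, j int) bool { return ledgers[i] < ledgers[j] })
		uniq := ledgers[:0]
		for i, l := range ledgers {
			if i == 0 || l != ledgers[i-1] {
				uniq = append(uniq, l)
			}
		}
		merged[acct] = uniq
	}
	return merged
}

func main() {
	chunkA := MapChunk([]Activity{{5, "GA"}, {1, "GA"}, {2, "GB"}})
	chunkB := MapChunk([]Activity{{2, "GA"}, {5, "GA"}})
	fmt.Println(Reduce(chunkA, chunkB))
}
```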

Shaptic avatar Jul 21 '22 22:07 Shaptic

I'm helping take over the last step here: running a range in the cloud to confirm that the indexes get built out. Initially, I will build for 07/01 - 07/31 to confirm.

sreuland avatar Aug 12 '22 19:08 sreuland

@Shaptic, I'm looking into an alternative k8s Job for batch processing in lieu of AWS Batch in the short term. It needs some adjustment on the cluster to support JOB_COMPLETION_INDEX, per https://github.com/stellar/ops/issues/1790
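
For context, Kubernetes sets `JOB_COMPLETION_INDEX` in each pod of a Job running in Indexed completion mode, so a worker can use it to pick its own slice of the ledger range. A minimal sketch of that idea follows; `jobIndexFromEnv`, `ledgerRangeForJob`, and the example numbers are hypothetical and are not the actual job configuration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// jobIndexFromEnv reads the completion index that Kubernetes injects
// into each pod of an Indexed Job.
func jobIndexFromEnv() (uint32, error) {
	raw, ok := os.LookupEnv("JOB_COMPLETION_INDEX")
	if !ok {
		return 0, fmt.Errorf("JOB_COMPLETION_INDEX is not set")
	}
	n, err := strconv.ParseUint(raw, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid JOB_COMPLETION_INDEX %q: %w", raw, err)
	}
	return uint32(n), nil
}

// ledgerRangeForJob returns the inclusive ledger range that the worker
// with the given index should process (hypothetical partitioning scheme).
func ledgerRangeForJob(firstLedger, ledgersPerJob, jobIndex uint32) (uint32, uint32) {
	start := firstLedger + jobIndex*ledgersPerJob
	return start, start + ledgersPerJob - 1
}

func main() {
	idx, err := jobIndexFromEnv()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Example values only: each worker handles 6400 ledgers.
	start, end := ledgerRangeForJob(1000, 6400, idx)
	fmt.Printf("worker %d processes ledgers [%d, %d]\n", idx, start, end)
}
```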

sreuland avatar Aug 15 '22 16:08 sreuland

Status update: the map/reduce jobs are running on k8s. I identified some slow performance in the reduce jobs when running against the S3 index. Next steps: add logging to reduce, capture the time spent in sections with S3 calls to identify I/O rates, and determine further reduce optimizations based on the findings.
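
One possible shape for that instrumentation, as a sketch only: wrap each section that makes S3 calls in a helper that accumulates call counts and elapsed time per section. `SectionTimer` and the section name `"s3.get"` are hypothetical, and the sleep stands in for a real S3 request.

```go
package main

import (
	"fmt"
	"time"
)

// SectionTimer accumulates call counts and elapsed time per named section.
// It is not safe for concurrent use; a real version would need a mutex.
type SectionTimer struct {
	totals map[string]time.Duration
	counts map[string]int
}

func NewSectionTimer() *SectionTimer {
	return &SectionTimer{
		totals: map[string]time.Duration{},
		counts: map[string]int{},
	}
}

// Track runs fn, records how long it took under the given section name,
// and returns fn's error unchanged.
func (t *SectionTimer) Track(section string, fn func() error) error {
	start := time.Now()
	err := fn()
	t.totals[section] += time.Since(start)
	t.counts[section]++
	return err
}

// Count returns how many times the section was tracked.
func (t *SectionTimer) Count(section string) int { return t.counts[section] }

// Total returns the accumulated time spent in the section.
func (t *SectionTimer) Total(section string) time.Duration { return t.totals[section] }

func main() {
	timer := NewSectionTimer()
	for i := 0; i < 3; i++ {
		_ = timer.Track("s3.get", func() error {
			time.Sleep(5 * time.Millisecond) // placeholder for an S3 call
			return nil
		})
	}
	fmt.Printf("s3.get: %d calls, %s total\n", timer.Count("s3.get"), timer.Total("s3.get"))
}
```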

sreuland avatar Aug 30 '22 17:08 sreuland

After discussion with @Shaptic, the partial index data loaded onto S3 will suffice for the MVP criteria, so I will move this ticket to done. I carved out a new ticket, #4566, for follow-up on reduce performance.

sreuland avatar Aug 31 '22 15:08 sreuland